An introduction to finite element methods for inverse coefficient problems in elliptic PDEs

Several novel imaging and non-destructive testing technologies are based on reconstructing the spatially dependent coefficient in an elliptic partial differential equation from measurements of its solution(s). In practical applications, the unknown coefficient is often assumed to be piecewise constant on a given pixel partition (corresponding to the desired resolution), and only finitely many measurements can be made. This leads to the problem of inverting a finite-dimensional non-linear forward operator $\mathcal F:\ \mathcal D(\mathcal F)\subseteq \mathbb R^n\to \mathbb R^m$, where evaluating $\mathcal F$ requires one or several PDE solutions. Numerical inversion methods require the implementation of this forward operator and its Jacobian. We show how to efficiently implement both using a standard FEM package and prove convergence of the FEM approximations against their true-solution counterparts. We present simple example codes for Comsol with the Matlab Livelink package, and numerically demonstrate the challenges that arise from non-uniqueness, non-linearity and instability issues. We also discuss monotonicity and convexity properties of the forward operator that arise for symmetric measurement settings.

Introduction to FEM for inverse coefficient problems in elliptic PDEs

Stationary diffusion
We consider the stationary diffusion equation with homogeneous Dirichlet boundary condition u| ∂Ω = 0 in a Lipschitz bounded domain Ω ⊂ R d , d ∈ N. For u ∈ H 1 (Ω), σ ∈ L ∞ + (Ω) and g ∈ L 2 (Ω) the equation is equivalent to finding u ∈ H 1 0 (Ω) witĥ Ω σ∇u · ∇v dx =ˆΩ gv dx for all v ∈ H 1 0 (Ω), (2) and unique solvability follows from the Lax-Milgram theorem. We are interested in the inverse coefficient problem of determining the diffusivity coefficient σ in (1) from measurements of the solution for one or several source terms g, cf. [9] for an application in groundwater filtration. In practical applications with finitely many measurements, it is natural to only aim for a certain pixel-based resolution and therefore assume that σ is piecewise constant with respect to a partition Ω = n i=1 P i , i.e.
where the pixels P i ⊆ Ω are assumed to be measurable subsets. The left image in figure 1 shows a simple example where the unit square Ω = (0, 1) 2 is divided into 3 × 3 pixels. In the following, with a slight abuse of notation, we write σ = (σ 1 , . . . , σ n ) T ∈ R n for the unknown diffusivity. The source term g in the diffusion model (1) can be identified with the linear functional on the right hand side of the variational formulation (2) l ∈ H −1 (Ω), l(v) :=ˆΩ gv dx, which corresponds to identifying L 2 (Ω) with a subset of H −1 (Ω). Accordingly, we consider excitations in the form of linear functionals. Also, to emphasize that the solution depends on the diffusion coefficient and the excitation, we write u l σ in the following. The left image in figure 2 illustrates the concentration resulting from a constant source term g = χ D , i.e. l(v) =´D v dx, where D = D 2 ∪ D 4 ∪ D 5 ∪ D 7 is a union of four circular subdomains as sketched in the right image of figure 1. The right image in figure 2 shows the corresponding plot for D = D 1 ∪ D 3 ∪ D 6 ∪ D 8 . Both images show the solution of (1) with constant diffusion coefficient σ = 1.
Natural models for measuring the solution of (1) also yield to linear functionals. Measuring the total concentration in one of the circular subdomains D j corresponds to measuring r(u) :=´D j u dx. Hence, the inverse problem of determining finitely many information about the diffusivity coefficient from finitely many measurements of the concentration (possibly but not necessarily resulting from different excitations) leads to the finite-dimensional inverse problem to and u Likewise, measuring the voltage between the k 1 -th and the k 2 -th electrode corresponds to measuring the linear functional of the solution (u, U ) ∈ H generated by some current pattern. Hence, the problem of determining the interior conductivity with a fixed finite resolution from finitely many voltage-current measurements in EIT (with CEM) leads to the finite-dimensional inverse problem to for all (w, W ) ∈ H, with given l j , r j ∈ H , j = 1, . . . , m, and Clearly, one could also extend this formulation to cover the case of unknown contact impedances.

The true-solution setting
The examples in section 2 lead to inverse problems for a finite-dimensional non-linear forward operator F : R n + → R m , where evaluations of F require solving an infinite-dimensional linear problem (the PDE). In this section, we will first derive some properties of F for the case that it is defined with the true infinite-dimensional PDE solution. The properties of the operator F ≈ F, that is defined with a FEM-approximation of the PDE solution, will be studied in section 4.

The true-solution forward operator and its derivative
We will study problems that appear in the variational formulation of elliptic PDEs with piecewise constant coefficients on a fixed pixel partition, as in the examples in section 2.
Introduction to FEM for inverse coefficient problems in elliptic PDEs 7 The variational setting. Let H be a Hilbert space. We consider the problem of finding u ∈ H that solves where b σ : H ×H → R is a bilinear form, and l ∈ H = L(H, R). b σ is assumed to linearly depend on n parameters σ = (σ 1 , . . . , σ n ) ∈ R n in the following way where b 0 , b i : H × H → R are bounded, symmetric and positive semidefinite bilinear forms. Writing 1 := (1, . . . , 1) T ∈ R n , we also assume that b 1 is bounded and coercive with constants β, C > 0, i.e., Clearly, this yields that for all σ ∈ R n so that b σ is symmetric, bounded and coercive. Here and in the following R n + denotes the set of all σ ∈ R n with σ > 0 and ">" and "≥" are understood elementwise on R n .
The true-solution forward operator. We now characterize the derivative of the solution of (4) with respect to σ. fulfills that, for all σ ∈ R n + and τ ∈ R n , S (σ)τ ∈ H is the unique solution of Also, for r ∈ H , σ ∈ R n + , and τ ∈ R n ,
It is easily checked, that B(σ) is Fréchet differentiable for every σ ∈ R n + , and that its derivative B (σ) ∈ L(R n , L(H, H )) is given by where B i ∈ L(H, H ) is the unique operator fulfilling Since B (σ) does not depend on σ, this also shows that B(σ) is infinitely often Fréchet differentiable with all second and higher derivatives being zero.
Using the derivative of operator inversion and the product and chain rule for the Fréchet derivative, we thus obtain that S(σ) is infinitely often Fréchet differentiable with Moreover, we obtain for all r ∈ H , by using the symmetry of B(σ), which finished the proof.
Introduction to FEM for inverse coefficient problems in elliptic PDEs 9 Corollary 1 Let l, r ∈ H . Then the mapping Proof This follows from Lemma 1.

Convexity and monotonicity for symmetric measurements
A special mathematical structure appears for measurements F l,r , when l and r are taken from the same subset of H , and all combinations are used. In the stationary diffusion example this corresponds to using the same subsets of Ω both for excitations and concentration measurements, in EIT this corresponds to using the same electrodes for voltage and current measurements. Given a set of m ∈ N excitations/measurements {l 1 , . . . , l m } ⊂ H , we combine the measurements into a matrix-valued map F : As before, we write "≥" for the elementwise order on R n . We also write S m ⊆ R m×m for the subset of symmetric m × m-matrices, and " " for the Loewner order on S m , i.e. B A denotes that B − A is positive semi-definite.
Lemma 2 F : R n + → R m×m has the following properties: (a) F is infinitely often differentiable.
and for all σ (1) , σ (2) and, for all t ∈ [0, 1], Proof Corollary 1 shows that each component of F is infinitely often differentiable so that (a) is proven. For the rest of the proof let σ ∈ R n + , g ∈ R m , and set l := m j=1 g j l j . By corollary 1, so that F(c) 0. If g = 0 and l 1 , . . . , l m ∈ H are linearly independent then l = 0, which implies u l σ = 0 and thus g T F(σ)g > 0. Hence, (b) is proven. To prove (c) and (d), we start by using again corollary 1 and obtain Since the bilinear forms b i (·, ·) are positive semi-definite, this implies (6).
Introduction to FEM for inverse coefficient problems in elliptic PDEs 11 4 The FEM setting

The FEM-approximated forward operator and its derivative
The Finite Element Method. The Finite Element Method numerically approximates the solution of (4) by solving it in a finite-dimensional subspace V ⊂ H, e.g. the subspace of continuous, piecewise linear functions on a fixed triangulation. Let Λ 1 , . . . , Λ N denote a basis of V , e.g. the so-called hat functions for linear finite elements. Then the finite-dimensional variational problem is equivalent toũ with λ = (λ j ) N j=1 ∈ R N , and the so-called stiffness matrix and load vector with j-th entry given by l(Λ j ).
It follows from the Lax-Milgram theorem that (10) is uniquely solvable and that B is a symmetric, positive definite (and thus invertible) matrix. Moreover, the Céa-Lemma yields that the FEM approximationũ l σ ∈ V is as good an approximation to the true solution u l σ ∈ H as elements of the finite-dimensional space V can be: where C σ := C max{1, σ 1 , . . . , σ n }, and β σ := β min{1, σ 1 , . . . , σ n } are the continuity and coercivity constants of b σ , cf. (5).
Pixel stiffness matrices. Finite element software packages include triangulation algorithms, assembling routines for the global stiffness matrix B σ and the load vector y l , and efficient solvers for the linear system B σ λ = y l . For our setting where we will also require the pixel stiffness matrices The assembling of B σ is usually done by writing it as a weighted sum of element stiffness matrices. In our setting, it is natural to assume that the pixel partition complies with the FEM triangulation, i.e., that each pixel is a union of triangulation elements. Figure 3 shows a coarser and a finer FEM mesh for where B 1+ei and B 1 denote the global stiffness matrix B σ for σ = 1 + e i and σ = 1, respectively, and e i ∈ R n is the i-th unit vector. Note that this does not require any knowledge of the triangulation details.
The FEM-approximated forward operator. Given l, r ∈ H , we approximate the true measurement F l,r (σ) = r(u l σ ) by F l,r (σ) := r(ũ l σ ), whereũ l σ ∈ V is the FEM-approximation to the true solution u l σ ∈ H, i.e., the solution of (10).

Lemma 3 Let l, r ∈ H . Then
Moreover, F l,r : R n + → R is infinitely often differentiable and its first derivatives fulfill Proof This follows by applying corollary 1 to the Hilbert space V .
From lemma 3, we obtain a simple FEM-based implementation of the forward operator and its derivative.
we have that Proof This follows from lemma 3.
We summarize the consequences of corollary 2 in algorithm 1. Using a FEM package that is capable of solving the considered PDE, and that allows access to the stiffness matrix and the load vector, one can simply implement the FEM-approximated forward operator and all its first derivatives by a few lines of extra code. This calculation merely requires solving two linear systems with the stiffness matrix (which is equivalent to two PDE solutions). Convergence of the FEM-approximated forward operator. The following lemma shows that the FEM-approximated operator and its first derivatives agree with their true-solution counterparts as good as the FEM solution agrees with the true solution. Hence, by the Céa-Lemma (12), F l,r (σ) and ∂ ∂σi F l,r (σ) will be as good an approximation to F l,r (σ) and ∂ ∂σi F l,r (σ) as the true solutions can be approximated by elements of the finite-dimensional space V .

Convexity and monotonicity for symmetric measurements
As in subsection 3.2 we now consider the symmetric measurement case, where l and r are taken from the same subset of H (and all combinations are used). Given a set of m ∈ N excitations/measurements {l 1 , . . . , l m } ⊂ H , we combine the measurements into a matrix-valued map F : The entries of F (σ) and its first derivatives ∂ ∂σi F (σ), i = 1, . . . , n can be calculated as in algorithm 1. Let us stress that this approach is particularly efficient in this symmetric case as it requires only m linear system solutions with the stiffness matrix (i.e., the equivalent of m PDE solutions) for calculating all m 2 entries of F (σ) ∈ R m×m and all nm 2 entries of the n matrices ∂ ∂σi F (σ) ∈ R m×m . As in subsection 3.2, the FEM-approximated forward operator is monotonically non-increasing and convex in the sense of the elementwise order "≥" on R n , and the Loewner order " " on the set of symmetric m × m-matrices.
Lemma 5 F : R n + → R m×m has the following properties: (a) F is infinitely often differentiable.
Proof (a)-(d) follow from applying lemma 2 on the Hilbert space V . (e) was proven in lemma 4.

Numerical examples and inverse problem challenges
In this section, we will show some numerical results for the stationary diffusion example from section 2.1 and demonstrate some major challenges that arise in solving the inverse coefficient problem of recoveringσ ∈ R n from F(σ) ∈ R m , or from a noisy version Y δ ≈ F (σ). The source codes for the following examples (and also for generating figure 1 and 2) are given in the appendix for the reader's reference.

Non-uniqueness
Even for m ≥ n, and a noise-free measurementŶ = F (σ) ∈ R m , it is not clear whether the measurements uniquely determine the unknownσ ∈ R n . To demonstrate this on a simple one-dimensional example, let us consider the stationary diffusion example with 3×3 pixels and circular excitation/measurement subdomains in each boundary pixel as in figure 1. We apply a source term in D 1 in the lower left pixel, and measure the resulting total concentration in D 8 in the top right pixel, so that l = χ 1 ∈ H −1 (Ω) and r = χ 8 ∈ H −1 (Ω), where we write χ j := χ Dj for the ease of notation. We choose σ = 1 in all pixels except P i , and on P i we vary the diffusivity in steps of 0.01 up to 3. Figure 4 shows F l,r (σ) for all i = 1, . . . , 9, in the same order as the pixels, e.g., the lower left image shows F l,r (σ) for σ = (σ 1 , 1, . . . , 1) for varying σ 1 . Intuitively speaking, one can see that rising the diffusivity in the middle pixel increases the measurement since particles can easier diffuse through the middle pixel on their way from the lower left to the top right. Rising the diffusivity in the corner pixels decreases the measurement since particles can easier diffuse to the boundary that is set to zero by the homogeneous Dirichlet condition. In the middle top, left, bottom, and right pixel, rising the diffusivity first increases the measurement since particles can easier find their way from the lower left to the top right. But at some point, this effect is reverted since it also drives particles to the boundary.
This demonstrates that changing a coefficient can have an increasing or decreasing effect on the measurements, and the effect does not have to be monotonous for all parameter values. It also indicates that an exact onedimensional measurementŶ = F l,r (σ) might uniquely determine one parameter inσ in some cases (here: the diffusivity in the middle and corner pixels), but non-uniqueness might occur in other cases (here: in the middle top, left, bottom, and right pixel).
Let us also stress the following point. A single non-symmetric measurement F l,r (σ) with l = r might depend non-monotonously on σ as demonstrated in figure 4. But, by lemma 5, for all l, r ∈ H , the symmetric measurements F l,l (σ), F r,r (σ), and also the matrix-valued measurement are monotonously non-increasing and convex functions of σ. Note that the monotonicity and convexity properties of the matrix-valued measurement hold with respect to the Loewner order even though the individual non-diagonal matrix elements might not have these properties.

Non-linearity and local minima
The inverse problem of recoveringσ ∈ R n from F (σ) ∈ R m could be considered as a non-linear root finding problem and (for n = m) approached with Newton's method which is only known to converge locally. However, in practice one usually takes redundant measurements (i.e., m > n), and, due to measurement or modelling errors, one cannot expect exact data fit. Hence, a common approach to reconstructσ ∈ R n from Y δ ≈ F (σ) ∈ R m is to minimize a residuum functional, e.g., or a sum of a residuum functional together with a regularization term. For our simple 3 × 3-pixel example, the left image in figure 5 shows a contour plot of R(σ) (in a normalized logarithmic scale) for a two-dimensional measurement function F (σ) := F χ1,χ7 (σ) F χ1,χ8 (σ) ∈ R 2 , exact data Y δ = F (σ),σ = 1 1 1 0.5 1 0.5 1 1 1 T and σ = 1 1 1 σ 4 1 σ 6 1 1 1 T . σ 4 and σ 6 are varied in steps of 0.002 up to 0.6. Again, for this choice of unknowns and measurements, we numerically observe non-uniqueness. The results indicate that there is a second pointσ =σ with F (σ) = F (σ). The right image in figure 5 shows a plot of R(σ) (in a normalized nonlogarithmic scale) for varying σ 4 while keeping σ 6 := σ 4 , i.e. the diagonal of the left image in figure 5. It indicates that in this case, one parameter inσ can be uniquely reconstructed from the two-dimensional measurement F (σ). But it also shows that the residuum functional R(σ) possesses a local minimum in a wrong point. Since the curse of dimensionality makes it practically infeasible to numerically find the global minimizer of a non-linear functional in more than a few unknowns, this demonstrates that optimization-based approaches also suffer from only local convergence.

Stability, error estimates and ill-posedness
Even if F (σ) uniquely determinesσ, and if this non-linear problem can be solved without running into a local minimum, the problem might be ill-posed in the sense that σ does not depend stably on F (σ). In that case, small errors in the measurements might lead to large errors in the reconstruction. The error amplification is often quantified by searching for a Lipschitz stability constant L > 0 with It should be stressed though, that such a stability estimate does not immediately yield an error estimate for noisy measurements Y δ ≈ F (σ), since Y δ might not lie in the range of F . To estimate the (relative) error amplification, we calculate the condition number of F (1) for our 3 × 3 example, where F : R 9 → R 64 now depends on all 9 pixel values, and the components of F are given by F l,r with l and r running through all 8 × 8-combinations of χ 1 , . . . , χ 8 , i.e., we use all combinations of circular subdomain for applying source terms and for measuring the solution. Note that, lemma 5 implies that roughly half of the measurements are redundant by symmetry, and that unlike in sections 3.2 and 4.2, we simply write the measurements as a long vector in order to have F (1) ∈ R 9,64 as a matrix.
We repeat the calculation of the condition number of F (1) for analogous settings with n x × n x -pixels, and (4n x − 4) × (4n x − 4) measurements on circular subdomains in the boundary pixels, cf. the left image in figure 6 for the geometry of the 15 × 15-pixel case. The right image in figure 6 shows the condition number as a function of the total number of pixels n 2 x for n x ∈ {2, 3, . . . , 15}.
Our results indicate that the instability of the considered inverse problem grows roughly exponentially with the number of unknowns which is in par with theoretical results on the similar elliptic inverse coefficient problem of EIT [17].

Further reading
There is vast literature on theoretical and numerical inverse coefficient problems, their practical applications, and the regularization of ill-posed problems. Let us single out just a very few results as starting points for further reading that are closely connected to the challenges addressed in this section, and the author's own research. Arguably the most prominent inverse elliptic coefficient problem is the so-called Calderón problem [7,8] with applications in EIT, cf. [1,5] for surveys on EIT and the related field of diffuse optical tomography. For theoretical uniqueness proofs in the infinite-dimensional setting of (intuitively speaking) infinitely many pixels and measurements we refer to [20,10,15]. A survey on solving parameter identification problems for PDEs with a focus on sparsity regularization can be found in [13]. The interplay between instability, regularization and FEM discretization is studied in [14]. A result result on convexification approaches to obtain globally convergent reconstruction algorithms can be found in [16]. Learning-based approaches for inverse coefficient problems and parametrized PDEs are studied in [4,18,6].
Results on uniqueness and Lipschitz stability for finitely many unknowns from infinitely or finitely many measurements have been obtained in, e.g., [3,2,11]. Let us stress that, for most problems, it is still an open question, how many (and which) measurements are required to uniquely determine an unknown PDE coefficient with a given resolution, how to explicitly quantify the error amplification, and how to obtain globally convergent reconstruction algorithms. For a relatively simple, but fully non-linear inverse problem of determining a Robin coefficient in an elliptic PDE, these question were recently answered in [12] by exploiting the convexity and monotonicity structure of symmetric measurements from section 3.2 and 4.2.