Abstract
We consider optimal control of the scalar wave equation where the control enters as a coefficient in the principal part. Adding a total variation penalty allows showing existence of optimal controls, which requires continuity results for the coefficient-to-solution mapping for discontinuous coefficients. We additionally consider a so-called multibang penalty that promotes controls taking on values pointwise almost everywhere from a specified discrete set. Under additional assumptions on the data, we derive an improved regularity result for the state, leading to optimality conditions that can be interpreted in an appropriate pointwise fashion. The numerical solution makes use of a stabilized finite element method and a nonlinear primal–dual proximal splitting algorithm.
Introduction
This work is concerned with an optimal control problem for the scalar wave equation where the control enters as the spatially varying coefficient in the principal part. Informally, we consider the problem
where \(y_d\) is a given (desired or observed) state, B is a bounded linear observation operator mapping to the observation space \({\mathcal {O}}\), \(\mathcal R\) is a regularization term, \(0<{\underline{u}} < {\overline{u}}\) are constants, and f, \(y_0\), and \(y_1\) (as well as boundary conditions) are given suitably. A precise statement is deferred to Sect. 2. Such problems occur, e.g., in acoustic tomography for medical imaging [1] and nondestructive testing [2] as well as in seismic inversion [3]. In the latter, the goal is the determination of a “velocity model” (as described by the coefficient u) of the underground in a region of interest from recordings (“seismograms”, modeled by \(y_d\)) of reflected pressure waves generated by sources on or near the surface (entering the equation via f, \(y_0\), \(y_1\), or inhomogeneous boundary conditions). If the region contains multiple different materials like rock, oil, and gas, the velocity model changes rapidly or may even jump across material interfaces.
In the stationary case, the question of existence of solutions to problem (1) under only pointwise constraints and regularization has received a tremendous amount of attention. However, it was answered in the negative in [4]; this and subsequent investigations led to the concept of H-convergence and, more generally, to homogenization theory; see, e.g., [5,6,7,8,9]. The use of regularization terms or constraints involving higher-order differential operators would certainly guarantee existence but contradicts the goal of allowing piecewise continuous controls u. Such considerations suggest the introduction of total variation regularization in addition to pointwise constraints. In this case, existence can be shown. However, this leads to difficulties in deriving necessary optimality conditions since the sum rule of convex analysis can only be applied in the \(L^\infty (\Omega )\) topology, which would lead to (generalized) derivatives that do not admit a pointwise representation. This difficulty can be circumvented by replacing the pointwise constraints by a (differentiable approximation of a) cutoff function applied to the coefficient in the equation and by using improved regularity results for the optimal state that allow extending the Fréchet derivative of the tracking term from \(L^\infty (\Omega )\) to \(L^s(\Omega )\) for \(s<\infty \) sufficiently large. Together, this allows obtaining derivatives and subgradients in \(L^r(\Omega )\) for some \(r>1\), which can be characterized in a suitable pointwise manner. This was carried out in [10], which considered for \({\mathcal {R}}\) a combination of total variation and multibang regularization; the latter is a convex pointwise penalty that promotes controls which take values from a prescribed discrete set (e.g., corresponding to different materials such as rock, oil, and gas); see also [11,12,13].
In the current work, we extend this approach to optimal control and identification of discontinuous coefficients in scalar wave equations by deriving under additional (natural) assumptions on the data the adapted higher regularity results for the wave equation based on elliptic maximal regularity theory [14]; see Assumption 3.1 and Proposition 3.6 below. We also address a suitable discretization of the problem using a stabilized finite element method [15] and its solution by a nonlinear primal–dual proximal splitting method [16,17,18].
Let us briefly comment on related literature. As there is a vast body of work on control and inverse problems for the wave equation, we focus here specifically on the identification of discontinuous (and, in particular, piecewise constant) coefficients. This problem has attracted strong interest over the last few decades, mainly due to its relevance in seismic inversion. Classical works are mainly concerned with the one-dimensional setting – as a model for seismic inversion in stratified or layered media – which allows making use of integral transforms to derive explicit “layer-stripping” formulas; see, e.g., [19,20,21,22]. Regarding the numerous works on wave speed identification in the multidimensional wave equation for seismic inversion, we only mention exemplarily [23,24,25]; see also the further literature cited there. The use of total variation penalties for recovering a piecewise constant wave speed in multiple dimensions has been proposed in, e.g., [26,27,28,29,30], although the earlier works employed a smooth approximation of the total variation to allow the numerical solution by standard approaches for nonlinear PDE-constrained optimization. Finally, joint multibang and total variation regularization of linear inverse problems and its numerical solution by a primal–dual proximal splitting method were considered in [31]. We also mention that multibang control is related to (but different from) switching controls, where at each instant in time, one and only one from a given set of time-dependent controls should be active; see, e.g., [32].
This work is organized as follows. In the next Sect. 2, we give a formal statement of the optimal control problem (1) and recall the relevant definitions and properties of the functionals involved. We then derive in Sect. 3 the results on regularity, stability, and a priori estimates for solutions of the state equation that will be needed in the rest of the paper. In particular, in Proposition 3.6 we show a Gröger-type maximal regularity result for the wave equation under additional assumptions on the data. Section 4 is devoted to existence and first-order necessary optimality conditions for optimal controls, where we use the mentioned maximal regularity result to show that the latter can be interpreted in a pointwise fashion. We then discuss the numerical computation of solutions using a stabilized finite element discretization (see Sect. 5.1) together with a nonlinear primal–dual proximal splitting method (see Sect. 5.2). This approach is illustrated in Sect. 6 for two examples: a transmission setup motivated by acoustic tomography and a reflection setup modeling seismic tomography.
Problem Statement
Let \(\Omega \subset {\mathbb {R}}^d\), \(d \in \{2,3\}\), be a bounded domain with \(C^{2,1}\) regular boundary \(\partial \Omega \) and outer normal \(\nu \). For brevity, we introduce the notation \(H:=L^2(\Omega )\) and \(V:=H^1(\Omega )\) and set \(I:=(0,T)\). Then we consider for \(f\in L^2(I,V)\), \(y_0\in V\), and \(y_1\in H\) the weak solution \(y\in C(\overline{I},V)\cap C^1({\overline{I}},H)\) to
This choice of Neumann boundary conditions corresponds, e.g., for acoustic waves to the situation of reflection at a sound-hard obstacle and for elastic waves to the absence of external forces at the boundary (which is a natural setting for seismic imaging via interior sources). We will discuss existence and regularity of solutions to (2) in the following Sect. 3.
The salient point is of course the coefficient u in the principal part, which we want to control on an open subset \(\omega _c\subseteq \Omega \), which is assumed to have a \(C^{2,1}\) regular boundary. For constants \({\underline{u}}, \overline{u}\) with \(0< {\underline{u}}< \overline{u} < \infty \) we define the set of admissible coefficients
and pick a reference coefficient \({\hat{u}}\in {\hat{U}}\). To map a control u defined on \(\omega _c\) to a coefficient defined on \(\Omega \), we introduce the affine bounded extension operator
The set of controls that can be extended to admissible coefficients is then given by
where \(u_{\min }<u_{\max }\) are such that \({\underline{u}} \le \inf _{x\in \omega _c} {\hat{u}}(x) + u_{\min } \le \sup _{x\in \omega _c} {\hat{u}}(x) + u_{\max } \le \overline{u}\). In particular, for \({\hat{u}} \equiv {\underline{u}}\), we have \(u_{\min }=0\) and \(u_{\max } = \overline{u}-{\underline{u}}\).
Moreover, we introduce the observation space \({\mathcal {O}}\) which is assumed to be a separable Hilbert space as well as a linear and bounded observation operator \(B\in {\mathbb {L}}(L^2(Q),{\mathcal {O}})\) with adjoint \(B^*\in {\mathbb {L}}({\mathcal {O}}, L^2(Q))\).
We then consider the optimal control problem
where y(u) is a weak solution to (2), G is the multibang penalty from [11, 12], \(\mathrm {TV}\) denotes the total variation, and \(\alpha \) and \(\beta \) are positive constants. In the remainder of this section, we recall the definitions and properties of the total variation and the multibang penalty relevant to the current work.
Total variation We recall, e.g., from [33,34,35] that the space \(BV(\omega _c)\) is given by those functions \(v\in L^1(\omega _c)\) for which the distributional derivative Dv is a Radon measure, i.e.,
The total variation of a function \(v\in BV(\omega _c)\) is then given by
i.e., the total variation (in the sense of measure theory) of the vector measure \(Dv\in {\mathcal {M}}(\omega _c;{\mathbb {R}}^d)=C_0(\omega _c;{\mathbb {R}}^d)^*\). Here, \(|\cdot |_2\) denotes the Euclidean norm on \({\mathbb {R}}^d\); we thus consider in this work the isotropic total variation. For \(v\in L^1(\omega _c)\setminus BV(\omega _c)\), we set \(\mathrm {TV}(v)=\infty \). It follows that \(BV(\omega _c)\) embeds into \(L^r(\omega _c)\) continuously for every \(r\in [1,\frac{d}{d-1}]\) and compactly if \(r < \frac{d}{d-1}\); see, e.g., [33, Cor. 3.49 together with Prop. 3.21]. In addition, the total variation is lower semicontinuous with respect to strong convergence in \(L^1(\omega _c)\), i.e., if \(\{u_n\}_{n\in {\mathbb {N}}}\subset BV(\omega _c)\) and \(u_n\rightarrow u\) in \(L^1(\omega _c)\), we have that
see, e.g., [35, Thm. 5.2.1]. Note that this does not imply that \(\mathrm {TV}(u)<\infty \) and hence that \(u\in BV(\omega _c)\) unless \(\{\mathrm {TV}(u_n)\}_{n\in {\mathbb {N}}}\) has a bounded subsequence. From (7), we also deduce that the convex extended real-valued functional \(\mathrm {TV}:L^p(\omega _c)\rightarrow {\mathbb {R}}\cup \{\infty \}\) is weakly lower semicontinuous for any \(p\in [1,\infty ]\).
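For intuition, the isotropic total variation can be illustrated on a uniform grid. The following finite-difference sketch is our own illustration (not part of the analysis); forward differences with constant extension of v at the boundary are one standard discretization of \(\mathrm {TV}(v)\) in two dimensions.

```python
import numpy as np

def tv_iso(v, h=1.0):
    """Finite-difference approximation of the isotropic total variation
    TV(v) = int |grad v|_2 dx of a 2D array v on a grid of spacing h.

    Forward differences; the last row/column uses a zero difference,
    which corresponds to constant extension of v at the boundary.
    """
    v = np.asarray(v, dtype=float)
    dx = np.zeros_like(v)
    dy = np.zeros_like(v)
    dx[:-1, :] = (v[1:, :] - v[:-1, :]) / h   # difference along first axis
    dy[:, :-1] = (v[:, 1:] - v[:, :-1]) / h   # difference along second axis
    return float(np.sum(np.sqrt(dx**2 + dy**2)) * h**2)
```

For a piecewise constant v, this recovers jump height times interface length (e.g., the indicator of a half-plane with unit jump across an interface of length 10 gives the value 10), which is exactly the behavior that makes the penalty suitable for piecewise constant coefficients.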
Multibang penalty Let \(u_{\min } \le u_1<\dots < u_m\le u_{\max }\) be a given set of desired coefficient values. The multibang penalty G is then defined similarly to [12], where we have to replace the box constraints \(u(x)\in [u_1,u_m]\) by a linear growth to ensure that G is finite on \(L^r(\omega _c)\), \(r<\infty \). For simplicity, we assume in the following that \(u_1 = u_{\min } = 0\) and \(u_m = u_{\max } = {\overline{u}} - {\underline{u}}\) (i.e., \({\hat{u}} = {\underline{u}}\)) and define
where \(g:{\mathbb {R}}\rightarrow {\mathbb {R}}\) is given by
This definition can be motivated via the convex envelope of \(\delta _{[u_1,u_m]}(t) + \frac{1}{2} t^2\) (where \(\delta \) denotes the indicator function in the sense of convex analysis), see [12]; note however that here (as in [10]) g is defined to be finite for every \(t\in {\mathbb {R}}\), while the convex envelope is only finite for \(t\in [u_1,u_m]\). We also remark that for \(m=2\), this reduces in the current setting to the well-known sparsity penalty (i.e., \(G(u) = \Vert u \Vert _{L^1(\omega _c)}\) for any \(u\in U\)).
It can be verified easily that g is continuous, convex, and linearly bounded from above and below, i.e.,
Since g is finite (and hence proper), convex, and continuous, the corresponding integral operator \(G:L^r(\omega _c)\rightarrow {\mathbb {R}}\) is finite, convex, and continuous (and hence a fortiori weakly lower semicontinuous) for any \(r\in [1,\infty ]\), see, e.g., [36, Prop. 2.53]. Also, the properties of g imply that
(G1) \(G(v) > G(0)=0\) for all \(v\in L^1(\omega _c)\setminus \{0\}\),

(G2) \(\frac{1}{2} u_2 \Vert v \Vert _{L^1(\omega _c)}\le G(v) \le u_m \Vert v \Vert _{L^1(\omega _c)}\) for all \(v\in L^1(\omega _c)\).
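The structure of such a pointwise integrand can be made concrete numerically. The following sketch is an illustration under our own assumptions, not the paper's exact definition: it evaluates the convex envelope of \(\frac{1}{2}t^2\) over the grid \(u_1<\dots <u_m\), i.e., on each cell \([u_i,u_{i+1}]\) the chord \(\frac{1}{2}((u_i+u_{i+1})t-u_iu_{i+1})\), with a linear extension of slope \(-\frac{1}{2}u_2\) below \(u_1=0\) chosen so that properties (G1) and (G2) hold.

```python
import numpy as np

def multibang_g(t, levels):
    """Pointwise multibang-type integrand g(t) for desired values
    levels = [u_1, ..., u_m] with u_1 = 0 (as assumed in the text).

    On [u_i, u_{i+1}], g is the chord of 0.5*t**2 through the nodes,
    i.e. 0.5*((u_i + u_{i+1})*t - u_i*u_{i+1}); outside [u_1, u_m]
    it grows linearly (slopes chosen to keep g convex and nonnegative).
    This is an illustrative stand-in for the definition in [10,12].
    """
    u = np.asarray(levels, dtype=float)
    t = float(t)
    if t < u[0]:
        return -0.5 * u[1] * t            # linear growth below u_1 = 0
    i = min(int(np.searchsorted(u, t, side='right')) - 1, len(u) - 2)
    a, b = u[i], u[i + 1]
    return 0.5 * ((a + b) * t - a * b)    # chord of 0.5*t**2 on [a, b]
```

The kinks of g at each \(u_i\) make the subdifferential of G multivalued there, which is precisely what promotes coefficient values from the discrete set; one also checks that this g satisfies the bounds in (G2).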
The State Equation
We first consider the state equation for a fixed coefficient \(u\in {\hat{U}}\) (i.e., defined and uniformly bounded on the full domain \(\Omega \) and satisfying \({\underline{u}}\le u \le \overline{u}\) almost everywhere). For given \(u\in {\hat{U}}\), \(f\in L^2(I,H)\), \(y_0\in V\), and \(y_1\in H\), we call \(y=y(u)\) a (weak) solution to (2) if \(y\in W:= L^2(I,V) \cap W^{1,2}(I,H)\) and
for all \(v\in W\) with \(v(T) = 0\). We then have the following existence and natural regularity result.
Lemma 3.1
For every \(u \in {\hat{U}}\) and \((f, y_0, y_1) \in L^2(I,H) \times V \times H\), there exists a unique (weak) solution \(y = y(u) \in Z := C({\overline{I}},V) \cap C^1({\overline{I}},H)\) to (2) satisfying
for a constant \(C_1\) independent of \((f,y_0, y_1) \in L^2(I,H) \times V \times H\) and \(u \in {\hat{U}}\).
Proof
Except for the estimate on \(\Vert \partial _{tt} y \Vert _{L^2({I}, V^*)}\), the claim follows from [37, Theorem 3.8.2, page 275], where we observe that due to our assumption on \(u\in {\hat{U}}\), the energy is coercive with respect to the seminorm in V; see also [23, Theorem 2.4.5]. The constant \(C_1\) depends on \({\underline{u}}\) and \({\overline{u}}\), but is otherwise independent of \(u\in {\hat{U}}\).
To verify the missing estimate, we use from (10) that
Since \(u \in {\hat{U}}\), we deduce that
We further deduce from the state equation that
with \({\tilde{C}}_1 = C_1+1\). \(\square \)
By the change of variables \(t\mapsto Tt\), we can also apply Lemma 3.1 to the dual problem
for any \(g\in L^2(I,H)\), \(\varphi _0\in V\), \(\varphi _1 \in H\), and any \(v\in W\) with \(v(0)=0\).
Corollary 3.2
For every \(u\in {\hat{U}}\) and \(g\in L^2(I,H)\), \(\varphi _0\in V\), and \(\varphi _1 \in H\), there exists a unique solution \(\varphi \in Z\) to (11) satisfying
Using this result, we can apply an Aubin–Nitsche trick or duality argument to show Lipschitz continuity of \(u\mapsto y(u) \in L^2(I,H)\), which we will need to show differentiability of the tracking term later.
Lemma 3.3
There exists a constant \(L>0\) such that the mapping \(u\mapsto y(u)\) satisfies
Proof
Let \(u_1,u_2\in {\hat{U}}\) be arbitrary and set \(\delta u := u_1-u_2\) and \(\delta y:= y(u_1)-y(u_2)\). Subtracting the weak equations for \(y(u_1)\) and \(y(u_2)\), we have that \(\delta y \in Z\) satisfies \(\delta y(0) = 0\) and
for all \(v\in W\) with \(v(T) = 0\).
Let now \(g\in L^2(I,H)\) be arbitrary and consider the corresponding solution \(\varphi _g\in W\) of (11) for \(u=u_1\), \(\varphi _0=0\), and \(\varphi _1=0\). Noting that \(v=\varphi _g\) is a valid test function for (12) and \(w=\delta y\) is a valid test function for (11), we obtain that
Using that \(\delta u\in L^\infty (\Omega )\) together with Lemma 3.1 and Corollary 3.2, this implies that
for all \(g\in L^2(I,H)\). Since \(L^2(I,H)\) is a Hilbert space, taking the supremum over all \(g\in L^2(I,H)\) yields the claim. \(\square \)
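Schematically, the duality argument just performed reads (up to the sign convention and which of \(y(u_1)\), \(y(u_2)\) appears, depending on how the productive zero is inserted):

\[ \Bigl \vert \int _0^T (\delta y, g)_H \,dt \Bigr \vert = \Bigl \vert \int _0^T \int _\Omega \delta u\, \nabla y(u_2)\cdot \nabla \varphi _g \,dx\,dt \Bigr \vert \le \Vert \delta u\Vert _{L^\infty (\Omega )}\, \Vert y(u_2)\Vert _{L^2(I,V)}\, \Vert \varphi _g\Vert _{L^2(I,V)}, \]

so that the a priori bounds from Lemma 3.1 and Corollary 3.2 yield the Lipschitz constant L in terms of the data.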
In stronger norms, we only have the following weak continuity result, which will be used repeatedly.
Lemma 3.4
Let \(\{u_n\}_{n\in {\mathbb {N}}}\subset {\hat{U}}\) be a sequence with \(u_n\rightarrow u\) in \(L^r(\Omega )\) for some \(r\in [1,\infty )\). Then \(u\in {\hat{U}}\) and \(y(u_n)\rightharpoonup y(u)\) in \(L^2(I,V)\cap W^{1,2}(I,H)\cap W^{2,2}(I,V^*)\). Furthermore, \(y(u_n)\rightharpoonup y(u)\) in H pointwise for all \(t\in [0,T]\).
Proof
The first assertion follows from the fact that \({\hat{U}}\) is closed in \(L^r(\Omega )\). From \(u_n\in {\hat{U}}\) and Lemma 3.1, the corresponding sequence \(\{y(u_n)\}_{n\in {\mathbb {N}}}\) of solutions to (9) is welldefined and bounded in \(L^2(I,V)\cap W^{1,2}(I,H)\cap W^{2,2}(I,V^*)\). By passing to successive subsequences (which we do not distinguish), we thus obtain that
From (9), we in particular have that
for arbitrary \(\varphi \in W \cap L^2(I,H^3(\Omega ))\) with \(\varphi (T) =0\). Since \(u_n\rightarrow u\) strongly in \(L^r(\Omega )\) and \(u_n,u\in {\hat{U}}\subset L^\infty (\Omega )\), we have for \(r\in [1,2]\)
and thus for all \(r\in [1,\infty )\)
Then we can pass to the limit in the weak formulation to obtain
for all such \(\varphi \). Since \(u\in {\hat{U}}\) and since the set of functions with \(\varphi \in W \cap L^2(I,H^3(\Omega ))\) and \(\varphi (T) =0\) is dense in \(\{\varphi \in W: \varphi (T)=0\}\), the last equation also holds for all \(v\in W\) with \(v(T) =0\). The density can be shown by adapting the density argument of \(C^\infty ({\overline{\Omega }})\) in V; see, e.g., [38, Cor. 9.8].
It remains to show that the limit y satisfies the initial condition \(y(0)=y_0\). First, for each \(v\in H\) we have \((y_n(\cdot ),v)_H \rightharpoonup ({{\bar{y}}}(\cdot ),v)_H\) in \(W^{1,2}(I)\). Hence
due to the compact embedding of \(W^{1,2}(I)\) into \(C({\overline{I}})\). In particular this implies that \((y_n(0),v)_H=(y_0,v)_H =({{\bar{y}}}(0),v)_H\) for all \(v\in H\). Since \(v\in H\) was arbitrary, we obtain \({{\bar{y}}}(0) =y_0\). This implies that \({{\bar{y}}}=y(u)\), and since the solution of (9) is unique, a subsequence–subsequence argument shows that the full sequence converges weakly to y(u). By a similar argument, \(y_n(t) \rightharpoonup {{\bar{y}}}(t)\) in H for all \(t\in {\overline{I}}\). \(\square \)
Stronger continuity of \(u\mapsto y(u)\) can be shown with respect to the \(L^\infty \) topology for the controls.
Lemma 3.5
Assume that \(f\in L^2(I,V)\) and let \(\{u_n\}_{n\in {\mathbb {N}}}\subset {\hat{U}}\) be a sequence with \(u_n\rightarrow u\) in \(L^\infty (\Omega )\). Then \(y(u_n)\rightarrow y(u)\) in \(L^2(I,V)\cap W^{1,2}(I,H)\).
Proof
First, the embedding \(L^\infty (\Omega )\subset L^r(\Omega )\), \(r\in [1,\infty )\), for bounded \(\Omega \) together with Lemma 3.4 shows that \(y_n:= y(u_n)\rightharpoonup y(u)\) in \(L^2(I,V)\cap W^{1,2}(I,H)\cap W^{2,2}(I,V^*)\).
We now introduce for \(u\in {\hat{U}}\) and \(y:=y(u)\) the energy
By the Lions–Magenes Lemma ([37, Lem. 8.3], cf. also [23, (2.24), p. 24]), we have that
and thus by the fundamental theorem of calculus we find that
We now define for \(v\in V\)
which is an equivalent norm on V for any \(u\in {\hat{U}}\). Subtracting (13) for \({\mathcal {E}}_u\) and \({\mathcal {E}}_{u_n}\) and adding the productive zero then yields for almost every \(t\in I\) that
and hence that
We know from Lemma 3.4 that \(y_n\rightharpoonup y\) in \(L^2(I,V)\cap W^{1,2}(I,H)\cap W^{2,2}(I,V^*)\). Thus the Aubin–Lions Lemma and the compactness of the embeddings \(V\hookrightarrow H\) and \(H\hookrightarrow V^*\) imply that \(y_n \rightarrow y\) in \(L^2(I,H)\cap W^{1,2}(I,V^*)\). Thus we have
Since \(\Vert \cdot \Vert _{V_u}\) is an equivalent norm on V, we have that the normed vector space \(V_u := (V,\Vert \cdot \Vert _{V_u})\) is equivalent to V and hence that \(L^2(I,V_u)\cap W^{1,2}(I,H)\simeq L^2(I,V)\cap W^{1,2}(I,H)\). This implies that \(y_n\rightharpoonup y\) also in \(L^2(I,V_u)\cap W^{1,2}(I,H)\), and together with (15) the Radon–Riesz property of Hilbert spaces implies that \(y_n\rightarrow y\) strongly in \(L^2(I,V_u)\cap W^{1,2}(I,H)\). Appealing again to the equivalence of V and \(V_u\) then yields the claim. \(\square \)
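For clarity, the Radon–Riesz step used above is the elementary Hilbert space identity

\[ \Vert y_n - y\Vert ^2 = \Vert y_n\Vert ^2 - 2(y_n,y) + \Vert y\Vert ^2 \longrightarrow \Vert y\Vert ^2 - 2\Vert y\Vert ^2 + \Vert y\Vert ^2 = 0, \]

where norm and inner product are those of \(L^2(I,V_u)\cap W^{1,2}(I,H)\): the mixed term converges due to the weak convergence \(y_n\rightharpoonup y\), and \(\Vert y_n\Vert \rightarrow \Vert y\Vert \) by (15).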
Under additional assumptions, we can show an improved regularity result.
Assumption 3.1
The data satisfy \(f\in L^2(I;V)\) and \((y_0,y_1)\in H^2(\Omega )\times H^1(\Omega )\) with \(\partial _\nu y_0 =0\). Furthermore,

(i)
\({\hat{u}}\) is constant on \(\Omega \setminus \omega _c\) and

(ii)
\(y_0\) is constant on \(\omega _c\) and \(\overline{\omega _c} \subset \Omega \).
Proposition 3.6
Let Assumption 3.1 hold. Then there exist \(q > 2\) and a constant \({\hat{C}}\) independent of u such that \(y(u) \in {L^\infty (I,W^{1,q}(\Omega ))}\) and, for all \(u \in {\hat{U}}\),
Proof
We proceed in two steps.
Step 1 First we assume that additionally
Let \(u\in {\hat{U}}\) and approximate u by \(\{u_n\} \subset C^\infty ({\overline{\Omega }}) \cap {\hat{U}}\) with \(u_n\rightarrow u\) in \(L^r(\Omega )\) for some \(r\in [1,\infty )\) and \(u_n={\hat{u}}\) in \(\Omega \setminus \omega _c\). Such a sequence can be found by first approximating u by \({\tilde{u}}_n \in C^\infty ({\mathbb {R}}^d)\) with \({\tilde{u}}_n = {\hat{u}}\) in \(\Omega \setminus \omega _c\); this sequence in turn is constructed by first introducing an intermediate approximation by functions \({\hat{u}}_n\) with the property that \(\lim _{n\rightarrow \infty } {\hat{u}}_n =u\) in \(L^r(\Omega )\) and \({\hat{u}}_n= {\hat{u}}\) in \(\Omega \setminus {\hat{\omega }}_{c,n}\), where the closure of \({\hat{\omega }}_{c,n}\) is contained in \(\omega _c\) and \(0<\mathrm {dist}(\partial \omega _c, {\hat{\omega }}_{c,n}) \le n^{-1}\). Then we use convolution by mollifiers of the functions \({\hat{u}}_n\) to obtain functions \({\tilde{u}}_n\in C^\infty ({\mathbb {R}}^d)\) that satisfy \(\lim _{n\rightarrow \infty } {\tilde{u}}_n =u\) in \(L^r(\Omega )\), see e.g. [39, page 132], and \({\tilde{u}}_n={\hat{u}}\) in \(\Omega \setminus \omega _c\). Next we choose functions \({\underline{\varphi }}^n\) and \({\overline{\varphi }} _n\) in \(C^\infty ({\mathbb {R}})\) with a common Lipschitz constant and such that \({\underline{u}}\le {\underline{\varphi }}^n \), \({\overline{\varphi }}_n\le {\overline{u}}\), and \({\underline{\varphi }}^n(s)\rightarrow \max ({\underline{u}},s)\), \({\overline{\varphi }}_n(s) \rightarrow \min ({\overline{u}},s)\) for all \(s\in {\mathbb {R}}\), see [39, page 125]. We then set \(u_n= {\overline{\varphi }}_n({\underline{\varphi }}^n ({\tilde{u}}_n))\) and estimate
using Lebesgue’s bounded convergence theorem.
We replace u in (2) by \(u_n\). Due to the regularity assumptions (17) and the assumption that \(y_0\) is constant on \(\omega _c\), we have
Together with \({\partial _\nu y_0}= 0\) and \(f \in W^{2,2}(I,V) \subset W^{2,2}(I,H)\), these properties allow applying Theorem 30.3 (with \(k=3\)) and Theorem 30.4 in [40], which guarantee that \(y_n = y(u_n) \in W^{1,2}(I,H^2(\Omega ))\cap W^{2,2}(I,V)\) and \({\partial _\nu y_n}(t) = 0\) on \(\partial \Omega \) for \(t \in \overline{I}\). Then we multiply (2) by \({\partial _t} {\mathrm {div}}(u_n\nabla y_n(t))\) and integrate over \(\Omega \). Integrating by parts on the right-hand side and using that \(\partial _\nu \partial _ty_n=0\) on \(\partial \Omega \), we obtain
and thus
Integrating this expression on (0, t), we find for \(t \in (0,T]\) that
Gronwall’s inequality then implies that for each \(t \in (0,T]\),
Since \(y_1 \in H^2(\Omega )\), it follows that \(\{\sqrt{u_n} \nabla y_1\}_{n\in {\mathbb {N}}}\) is bounded in \({\mathbb {L}}^2(\Omega )\). Moreover, we have \({\mathrm {div}}(u_n \nabla y_0)|_{\Omega \setminus \omega _c} = {\mathrm {div}}({\hat{u}} \nabla y_0)|_{\Omega \setminus \omega _c} = {\hat{u}} \Delta y_0|_{\Omega \setminus \omega _c}\) and \({\mathrm {div}}(u_n \nabla y_0)|_{\omega _c} =0\). Hence \(\{{\mathrm {div}}(u_n \nabla y_0)\}_{n\in {\mathbb {N}}}\) is bounded in H. We can thus conclude that \(\{{\mathrm {div}}(u_n \nabla y_n)\}_{n\in {\mathbb {N}}}\) is bounded in \(L^\infty (I, H)\).
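The form of Gronwall's inequality used above is the standard integral version: if \(\eta \ge 0\) satisfies

\[ \eta (t) \le a + b \int _0^t \eta (s)\,ds \qquad \text {for all } t\in (0,T] \]

with constants \(a,b\ge 0\), then \(\eta (t) \le a\, e^{bt}\) for all \(t\in (0,T]\).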
Our next aim is to obtain \(L^\infty (I,W^{1,q}(\Omega ))\) regularity and boundedness for \(y_n\) for some \(q > 2\). For this purpose, we define for some \(\lambda > 0\)
and note that \(\{g_n\}_{n\in {\mathbb {N}}}\) is bounded in \(L^\infty (I,H)\). Furthermore, Sobolev’s embedding theorem implies that \(W^{1,s'}(\Omega ) \hookrightarrow L^2(\Omega )\) for every \(s' \ge 1\) in case \(d = 2\), and for every \(s' \ge \frac{6}{5}\) in case \(d = 3\). Following the notation of [14], we denote by \(W^{-1,s}(\Omega )\) the dual space of \(W^{1,s'}(\Omega )\) with s the conjugate of \(s'\). Then we have \(H \hookrightarrow W^{-1,s}(\Omega )\), where \(s \in [1, \infty )\) for \(d = 2\) and \(s \in [1,6]\) for \(d = 3\). It follows that \(\{g_n\}_{n\in {\mathbb {N}}}\) is bounded in \(L^\infty (I,W^{-1,s}(\Omega ))\). Considering now (18) (together with homogeneous Neumann boundary conditions) for a.e. \(t\in I\) as an equation for \(y_n(t)\), this implies that there exists some \(q > 2\) such that
where the constant C depends only on \({\underline{u}}\), \({\overline{u}}\), and q, but not on t, see [14, Thm. 1]. Hence \(\{y_n\}_{n\in {\mathbb {N}}}\) is bounded in \(L^\infty (I,W^{1,q}(\Omega ))\). Since \(L^1(I)\) and \(W^{1,q}(\Omega )\) are separable with the latter being reflexive, \(L^\infty (I,W^{1,q}(\Omega ))\) is the dual of a separable space; see, e.g., [41, Thm. 8.18.3]. Hence there exist a subsequence (not relabeled) and some \({\bar{y}} \in L^\infty (I,W^{1,q}(\Omega ))\) such that \(y_n \rightharpoonup ^* {\bar{y}}\) in \(L^\infty (I,W^{1,q}(\Omega ))\).
Finally, from Lemma 3.4 we also have that \(y_n\rightharpoonup y(u)\) in \(L^2(I,V)\cap W^{1,2}(I,H)\) and hence, by uniqueness of y(u), that \(y(u)={{\bar{y}}} \in L^\infty (I,W^{1,q}(\Omega ))\). Using \(\hbox {weak}^*\) lower semicontinuity of the norm (cf., e.g., [38, p. 63]), we can now pass to the limit in (19) to obtain (16) for those \((y_0,y_1,f)\) which satisfy the additional regularity assumption (17).
Step 2 We relax the requirements on the problem data and choose an arbitrary \((y_0,y_1,f)\in X:= H^2(\Omega )\times H^1(\Omega )\times L^2(I;V)\) with \(\partial _\nu y_0 =0\). Then there exists \((y^n_0,y^n_1,f^n)\in H^3(\Omega )\times H^2(\Omega )\times W^{2,2}(I;V)\) with \(\partial _\nu y^n_0 =0\) such that \(\lim _{n\rightarrow \infty }(y^n_0,y^n_1,f^n)= (y_0,y_1,f)\) in X. As this is standard for the second and third components, we only address the first one. Let \(\omega _c^c := \Omega \setminus \overline{\omega _c}\). By assumption, \(\partial \Omega \cap \partial \omega _c= \emptyset \); in addition, \(\Omega \) and \(\omega _c\) are \(C^{2,1}\) domains, and thus \(\omega _c^c\) is a \(C^{2,1}\) domain as well. It is in this step that the \(C^{2,1}\) regularity of the domains is used. Since \(y_0 \in H^2(\Omega )\), we have \(y_0|_{\partial \Omega } \in H^{3/2}(\partial \Omega )\). Moreover, \(y_0|_{\partial \omega _c}=:y_c\) is constant and \(\partial _\nu y_0|_{\partial \omega ^c_c} = \partial _\nu y_0|_{\partial \omega _c \cup \partial \Omega } =0\). Let \({\tilde{v}}_n \in H^{5/2}(\partial \Omega )\) be such that \({\tilde{v}}_n \rightarrow y_0|_{\partial \Omega }\) in \(H^{3/2}(\partial \Omega )\). Accordingly, let \(v_n \in H^3(\omega ^c_c)\) with \(v_n|_{\partial \omega _c}=y_c\), \(\partial _\nu v_n|_{\partial \omega ^c_c}=0\), \(\partial _{x_i x_j}v_n|_{\partial \omega _c}=0\), \(v_n|_{\partial \Omega } = {\tilde{v}}_n\), and \(v_n\rightarrow v\) in \(H^2(\omega ^c_c)\), where \(v\in H^2(\omega ^c_c)\) satisfies \(v|_{\partial \omega _c}=y_c\), \(\partial _\nu v|_{\partial \omega ^c_c}=0\), and \(v|_{\partial \Omega } = y_0|_{\partial \Omega }\).
Denoting by \(v_{n,\mathrm {ext}}\) and \(v_{\mathrm {ext}}\) the extensions of \(v_n\) and v by the constant \(y_c\) on \(\omega _c\), we have \(v_{n,\mathrm {ext}} \rightarrow v_{\mathrm {ext}}\) in \(H^2(\Omega )\), \(\partial _\nu v_{n,\mathrm {ext}}|_{\partial \Omega }=0\), \(v_{n,\mathrm {ext}}|_{\omega _c}= v_{\mathrm {ext}}|_{\omega _c}=y_c\), and \(v_{n,\mathrm {ext}} \in H^3(\Omega )\), where we use that \(\partial _{x_i x_j}v_n|_{\partial \omega _c}=0\). Next, observe that \(y_0-v \in H^2_0(\omega ^c_c)\). This implies the existence of functions \(w_n\in C^\infty (\omega _c^c)\) with compact support in \(\omega ^c_c\) such that \(w_n \rightarrow y_0-v\) in \(H^2(\omega _c^c)\); see, e.g., [42, pp. 17 and 31]. Denoting the extension by zero to \(\omega _c\) of \(w_n\) by \(w_{n,\mathrm {ext}}\), we have \(w_{n,\mathrm {ext}}\in H^3(\Omega )\), \(w_{n,\mathrm {ext}} \rightarrow y_0-v_{\mathrm {ext}}\) in \(H^2(\Omega )\), and \(\partial _\nu w_{n,\mathrm {ext}}|_{\partial \Omega }=0\). Finally, the sequence \(y^n_0=v_{n,\mathrm {ext}}+w_{n,\mathrm {ext}}\) defines the desired approximation of \(y_0\) such that \(y^n_0|_{\omega _c}=y_c\), \(\partial _\nu y^n_0|_{\partial \Omega }=0\), and \(y^n_0 \rightarrow y_0\) in \(H^2(\Omega )\).
We next define \(y^n\) as the solution to (2) with data \((y_0^n,y_1^n,f^n)\). From Lemma 3.1 and (16) for the smooth data \((y_0^n,y_1^n,f^n)\) we deduce that \(y^n \rightharpoonup y\) in \(L^2(I,V)\cap W^{1,2}(I,H)\) and \(y^n \rightharpoonup ^* y\) in \(L^\infty (I,W^{1,q}(\Omega ))\), where y is the solution of (2) with data \((y_0,y_1,f)\), and
for all n. Passing to the limit as \(n\rightarrow \infty \), we obtain (16) for \((y_0,y_1,f)\in X\) with \(\partial _\nu y_0=0\). \(\square \)
Remark 3.7
If Assumption 3.1 holds, the requirement in Lemma 3.5 on the convergence of \(u_n\) can be relaxed to \(u_n\rightarrow u\) in \(L^{\frac{q}{q-2}}(\Omega )\), where q is the exponent from Proposition 3.6. In this case, the first term on the right-hand side in (14) can be estimated by Hölder’s inequality as
where we used Proposition 3.6. Then again \(y(u_n) \rightarrow y(u)\) in W.
Remark 3.8
In Assumption 3.1 (ii), the requirement \(\overline{\omega _c} \subset \Omega \) was only used in Step 2 of the proof of Proposition 3.6. It is not necessary if instead \(y_0\in H^3(\Omega )\) is assumed.
Existence and Optimality Conditions
Deriving useful optimality conditions requires replacing the pointwise control constraints with a differentiable approximation of a cutoff function. We thus introduce the superposition operator
where \(\varphi _\varepsilon :{\mathbb {R}}\rightarrow [u_{\min },u_{\max }]\) is such that \(\Phi _\varepsilon \) is Lipschitz continuous from \(L^r(\omega _c)\) to \(L^r(\omega _c)\) for every \(r\in [1,\infty ]\) and \(\varepsilon \ge 0\) and Fréchet differentiable from \(L^\infty (\omega _c)\) to \(L^\infty (\omega _c)\) for \(\varepsilon >0\) (and thus ensuring Fréchet differentiability of the tracking term; see Lemma 4.2 below). The construction of such a \(\varphi _\varepsilon \) and the characterization of the Fréchet derivative of \(\Phi _\varepsilon \) via pointwise a.e. multiplication can be carried out in the same way as in [10, § 2.3].
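We do not reproduce the construction from [10, § 2.3] here; the following sketch shows one possible cutoff with the stated mapping properties (the quadratic transition of width \(\varepsilon \) is our own illustrative choice): it maps \({\mathbb {R}}\) into \([u_{\min },u_{\max }]\), is 1-Lipschitz and \(C^1\) for \(\varepsilon >0\), agrees with the identity on \([u_{\min }+\varepsilon ,u_{\max }-\varepsilon ]\), and reduces to the exact pointwise projection for \(\varepsilon =0\).

```python
def phi_eps(t, u_min, u_max, eps):
    """C^1, 1-Lipschitz cutoff of t onto [u_min, u_max] with quadratic
    smoothing of width eps at both bounds (requires
    u_min + eps <= u_max - eps); eps = 0 gives the exact clamp.
    Illustrative stand-in for the construction in [10, Sect. 2.3]."""
    if eps == 0.0:
        return min(max(t, u_min), u_max)
    if t <= u_min - eps:
        return u_min
    if t <= u_min + eps:
        # quadratic blend: value u_min, slope 0 at u_min - eps;
        # value u_min + eps, slope 1 at u_min + eps
        return u_min + (t - u_min + eps) ** 2 / (4 * eps)
    if t <= u_max - eps:
        return t                          # identity in the interior
    if t <= u_max + eps:
        return u_max - (u_max + eps - t) ** 2 / (4 * eps)
    return u_max
```

Note that for \(\varepsilon >0\) the map differs from the identity only within distance \(\varepsilon \) of the bounds, consistent with the remark below that \(\varepsilon \) merely influences the behavior of the cutoff near the pointwise bounds.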
We then consider for \(\varepsilon \ge 0\) the reduced, unconstrained optimization problem
for
for some \(y_d \in {\mathcal {O}}\) and \(\alpha ,\beta >0\), where \(u \mapsto y(u)\) denotes the solution mapping of (9) introduced in the previous section and \({\hat{E}}\) is the affine extension operator from \(\omega _c\) to \(\Omega \) defined in (4). We point out that the role of \(\varepsilon \) is not that of a smoothing parameter for the optimization problem, which remains nonsmooth for \(\varepsilon >0\) due to the presence of G and \(\mathrm {TV}\); it merely influences the behavior of the cutoff function near the upper and lower values of the pointwise bounds for the coefficient.
Existence of optimal controls now follows analogously to [10, Prop. 3.1].
Proposition 4.1
For every \(\varepsilon \ge 0\), there exists a global minimizer \({{\bar{u}}}\in BV(\omega _c)\cap U\) of \(J_\varepsilon \).
Proof
Since \(J_\varepsilon \) is bounded from below, there exists a minimizing sequence \(\{u_n\}_{n\in {\mathbb {N}}}\subset BV(\omega _c)\). Furthermore, we may assume without loss of generality that there exists a \(C>0\) such that
and hence that \(\{u_n\}_{n\in {\mathbb {N}}}\) is bounded in \(BV(\omega _c)\). By the compact embedding of \(BV(\omega _c)\) into \(L^1(\omega _c)\) for any \(d\in {\mathbb {N}}\), we can thus extract a subsequence, denoted by the same symbol, converging strongly in \(L^1(\omega _c)\) to some \({{\bar{u}}}\in L^1(\omega _c)\). Due to the continuity of \(\Phi _\varepsilon \) as well as of \({\hat{E}}\), we have \({\hat{E}}\Phi _\varepsilon (u_n)\rightarrow {\hat{E}}\Phi _\varepsilon ({{\bar{u}}})\in {\hat{U}}\) in \(L^1(\Omega )\).
Lower semicontinuity of G and \(\mathrm {TV}\) with respect to the strong convergence in \(L^1(\omega _c)\) and the weak convergence \(y({\hat{E}}\Phi _\varepsilon (u_n))\rightharpoonup y({\hat{E}}\Phi _\varepsilon ({{\bar{u}}}))\) in \(L^2(Q)\) from Lemma 3.4 yield that
and thus that \({{\bar{u}}}\in BV(\omega _c)\) is the desired minimizer.
The fact that \({{\bar{u}}}\in U\) then follows by a contraposition argument based on Stampacchia’s Lemma for BV functions and the pointwise definition of G, see [10, Prop. 3.2]. \(\square \)
The convergence of minimizers of (20) as \(\varepsilon \rightarrow 0\) can be shown along the same lines as indicated at the end of [10, § 3].
We now derive first-order optimality conditions for the solution of (20). To this end, we first show Fréchet differentiability of the tracking term
as in [10, Lem. 4.1] by using for given \(u\in L^\infty (\omega _c)\) and \(y\in W\) the definition of the adjoint equation
for any \(\varphi \in W\) with \(\varphi (0)=0\), which admits a unique solution \(p\in W\) by Lemma 3.1. In the following, we use the regularity of solutions to identify the derivative in \(L^\infty (\omega _c)^*\) with its representation in \(L^1(\omega _c)\), considered as a subset of \(L^\infty (\omega _c)^*\). Since the extension operator \({\hat{E}}\) is affine, we also introduce the corresponding linear extension operator \({\hat{E}}_0 = {\hat{E}}':L^2(\omega _c)\rightarrow L^2(\Omega )\).
Lemma 4.2
For every \(\varepsilon >0\), the mapping \(F_\varepsilon \) defined in (21) is Fréchet differentiable in every \(u\in L^\infty (\omega _c)\), and the Fréchet derivative is given by
where \(y=y({\hat{E}}\Phi _\varepsilon (u))\) is the solution of (9), p is the solution of (22), and \({\hat{E}}_0^*:L^2(\Omega )\rightarrow L^2(\omega _c)\) is the restriction operator.
If Assumption 3.1 holds, \(F_\varepsilon '(u) \in L^{\frac{2q}{2+q}}(\omega _c)\) for the \(q>2\) given in Proposition 3.6.
Proof
We first show directional differentiability. Let \(w, h\in L^\infty (\omega _c)\) and \(\rho >0\) be arbitrary. We define \({\tilde{y}}(w):=y({\hat{E}}\Phi _\varepsilon (w))\). We now insert the productive zero \(B{\tilde{y}}(w)-B{\tilde{y}}(w)\) in \(F_\varepsilon (w+\rho h)\) and expand the square to obtain
For the first term, we can use Lemma 3.3, the boundedness of B and the Lipschitz continuity of \(\Phi _\varepsilon \) to estimate
For the second term in (24), we introduce the adjoint state p(w) and use the fact that \(\delta y:={\tilde{y}}(w+\rho h)-{\tilde{y}}(w)\in W\) with \(\delta y(0)=0\). Testing (22) with \(\varphi = \delta y\), and using (9) for \(y={\tilde{y}}(w)\) and \(y={\tilde{y}}(w+\rho h)\), each time with \(v=p(w)\), and inserting a productive zero, we find
By Lemma 3.3 we have that \({\tilde{y}}(w+\rho h)\rightharpoonup {\tilde{y}}(w)\) in \(L^2(I,V)\) as \(\rho \rightarrow 0^+\). Moreover, since \(\varepsilon >0\) we have that \({\hat{E}}\Phi _\varepsilon \) is Fréchet differentiable in \(L^\infty (\omega _c)\) with \({\hat{E}}_0\Phi _\varepsilon '(w),{\hat{E}}_0\Phi _\varepsilon '(w+\rho h)\in L^\infty (\Omega )\). Hence, dividing (24) by \(\rho >0\) and passing to the limit implies in combination with (25) that
Since the mapping \(h\mapsto F_\varepsilon '(w;h)\) is linear and bounded, \({\hat{E}}_0^*(\int _0^T\nabla y(w)\cdot \nabla p(w)\,\mathrm {d}t)\,\Phi _\varepsilon '(w)\) is the Gâteaux derivative of \(F_\varepsilon \) at w. Thus, \(F_\varepsilon \) is Gâteaux differentiable in \(L^\infty (\omega _c)\).
It remains to show that this is also a Fréchet derivative. From the above, we have that
and hence that
since \({\tilde{y}}(w+h) \rightarrow {\tilde{y}}(w)\) in \(L^2(I,V)\) by Lemma 3.5.
The regularity follows from \({\tilde{y}}(w),p(w)\in L^2(I,V)\) together with the properties of the norm in Bochner spaces, cf. [43, Cor. V.1]. If Assumption 3.1 holds, Proposition 3.6 yields \({\tilde{y}}(w)\in L^\infty (I,W^{1,q}(\Omega ))\) for some \(q>2\) and hence \({\hat{E}}_0^*\left( \int _0^T\nabla {\tilde{y}}(w)\cdot \nabla p(w)\,\mathrm {d}t\right) \Phi _\varepsilon '(w)\in L^2(I,L^{\frac{2q}{2+q}}(\omega _c))\). \(\square \)
We can now proceed exactly as in [10] to obtain first-order necessary optimality conditions.
Theorem 4.3
([10, Thm. 4.2]) If \(\varepsilon >0\) and Assumption 3.1 holds, every local minimizer \({{\bar{u}}}\in BV(\omega _c)\) to (20) satisfies
where G and \(\mathrm {TV}\) are considered as extended real-valued convex functionals on \(L^{\frac{2q}{q-2}}(\omega _c)\).
Introducing explicit subgradients for the two subdifferentials, we obtain primal–dual optimality conditions.
Corollary 4.4
([10, Cor. 4.3]) For any local minimizer \({{\bar{u}}}\in BV(\omega _c)\) to (20), there exist \({{\bar{q}}} \in L^{\frac{2q}{2+q}}(\omega _c)\) and \({\bar{\xi }} \in L^{\frac{2q}{2+q}}(\omega _c)\) satisfying
These conditions can be further interpreted pointwise. First, using the characterization of Lemma 4.2, we can identify the first term in the first equation with the \(L^{1+\delta }(\omega _c)\) function given by
Second, using the characterization of \(\partial G\) from [10, § 2], we have that
The interpretation of the final term is more delicate. Informally, \(\xi (x)\) corresponds to the mean curvature of \({{\bar{u}}}(x)\) (if \({{\bar{u}}}\) is smooth at x) or the signed normal to its jump set (if \({{\bar{u}}}\) has a jump discontinuity across a measurable curve of \((d-1)\)-dimensional Hausdorff measure greater than zero). This can be made more precise using the notion of the full trace from [44], see also [45].
Numerical Solution
In this section, we address the numerical solution of (20) using a stabilized space-time finite element discretization for second-order hyperbolic equations [15] and a nonlinear primal–dual proximal splitting algorithm [16,17,18]. Since we now consider a finite-dimensional optimization problem, we can include the constraint \(u \in U\) directly via the multibang penalty instead of enforcing it inside the state equation. In this and the following section, we will therefore omit \(\Phi _\varepsilon \) from the discretized tracking term (and, with it, \(\varepsilon \) in general) and define the multibang penalty with \(\mathrm {dom}\,g = [u_1,u_d]\) as in [12]; see (34) below.
Discretization
We consider a mesh \({\mathcal {T}}_h\) consisting of a finite set of triangles or tetrahedra T with a mesh size h. Then we introduce the space \(D_h\subset H^1(\Omega )\cap C({\overline{\Omega }})\) of linear finite elements based on the triangulation \({\mathcal {T}}_h\). A basis of this space is given by the standard hat functions \(\varphi _i\) associated with the nodes \(x_i\), \(i=1,\ldots , N_h\), of the triangulation \({\mathcal {T}}_h\). Next we discretize the time interval I uniformly by \(0=t_0< \cdots < t_{N_\tau }=T\) with grid size \(\tau \). Similarly, we define the space \(D_\tau \subset H^1(I)\cap C({\overline{I}})\) of piecewise linear and continuous functions with respect to this grid. Furthermore we consider the hat functions \(e_i\), \(i=0,\ldots ,N_\tau \), with \(e_i(t_l)=\delta _{il}\), which form a basis of \(D_\tau \). We assume that \(\omega _c\) can be represented exactly by the triangulation \({\mathcal {T}}_h\) and introduce the space
Moreover we introduce the space \(C_h^c\) of piecewise constant functions on the triangles in \(\omega _c\). In the following we also identify \(D_h^c\) with \({\mathbb {R}}^{N_c}\) for \(\dim D_h^c=N_c\) and \(C_h^c\) with \({\mathbb {R}}^{M_c}\) for \(\dim C_h^c=M_c\). Finally we define \(\vartheta :=(\tau ,h) >0\) and introduce the stabilization parameter \(\sigma \ge 0\).
Definition 5.1
We call \(y_{\vartheta }\in D_{\vartheta }:=D_h \otimes D_\tau \) a discrete solution of (9) if \(y_{\vartheta }\) satisfies
for all \(v \in D_{\vartheta }\) with \(v(T)=0\) and initial condition \(y_{\vartheta }(0)=S_0 y_0\) defined via
This is a space-time finite element discretization with piecewise linear elements in space and in time. The additional \(\sigma \)-term in (28) serves as a stabilization term, which vanishes for \(\vartheta \rightarrow 0\) and is connected to the error term in the trapezoidal rule for the time integral of the third bilinear form in (28). The stability depends significantly on the value of \(\sigma \) with the method being more stable for larger \(\sigma \); e.g., for \(\sigma \ge 1/4\), the method is unconditionally stable and convergent while for \(0\le \sigma <1/4\), a CFL-like condition has to be satisfied to ensure stability as well as convergence, see [15, Thms. 2.1, 3.1] for homogeneous Dirichlet boundary conditions. At the same time, (28) can be formulated as the following time-stepping scheme: Set \(y_h^0=S_0y_0\) and
for all \(\varphi \in D_h\). For \(\sigma =1/4\), this method is equivalent to the implicit, unconditionally stable, and convergent Crank–Nicolson scheme, while for \(\sigma =0\), the method is explicit if the spatial mass matrix is lumped. The main benefit in our context is that this is an adjointconsistent discretization and therefore can be used to obtain a conforming discretization of (6) in a straightforward manner.
Next we introduce the discrete control-to-observation operator \(S_\vartheta :U\cap D_h^c\rightarrow {\mathcal {O}}\) defined by \(u\mapsto By_\vartheta \) where \(y_\vartheta \) is the solution of (28) for the coefficient \(u\in U\cap D_h^c\). Let \(\delta >0\). Then the implicit function theorem implies that \(S_\vartheta \) is Fréchet differentiable on the open subset
This set contains \(U\cap D_h^c\). The implicit function theorem is applicable since the following linearized discrete state equation is wellposed in the variable \(\delta y\in D_\vartheta \) for every \(\delta u_h\in D_h^c\):
for all \(v \in D_{\vartheta }\) with \(v(T)=0\) and initial condition \(\delta y_{\vartheta }(0)=0\) as well as \(y_\vartheta =S_\vartheta (u)\). Thus the derivative of \(S_\vartheta \) at \(u_h\) is given by \(S_\vartheta '(u_h):D_h^c\rightarrow {\mathcal {O}},~\delta u_h\mapsto B\delta y_\vartheta \), where \(\delta y_\vartheta \) solves (30) for \(\delta u_h\). Its adjoint (with respect to the \(L^2(Q)\) and \({\mathcal {O}}\) inner products) is given by
where \(y_\vartheta =S_\vartheta (u)\) and \(p_{\vartheta }\) solves the discrete adjoint equation
for all \(v \in D_{\vartheta }\) with \(v(0)=0\) and initial condition \(p_{\vartheta }(T)=0\), which can be formulated as a time-stepping scheme similar to (29).
We now introduce the variables \(y_h\) and \(p_h\) defined by
Thus \((S_\vartheta '(u_h))^*o\) with \(o\in {\mathcal {O}}\) has the representation
where K is given by
with the temporal mass matrix \(M_\tau \) and stiffness matrix \(A_\tau \). For \(\sigma =0\), K is a diagonal matrix.
We now address the discretization of the control costs in the optimization problem (6). Since a function \(u_h\in D_h^c\) is an element of \(H^1(\omega _c)\) and its weak derivatives are thus piecewise constant on the triangulation of \(\omega _c\), we have
We furthermore approximate the integral in the definition of G by the trapezoidal rule to obtain the discrete multibang penalty \(G_h:D_h^c \rightarrow {\overline{{\mathbb {R}}}}\). Using these definitions, we obtain the fully discrete optimization problem
Note that although (33) is discrete, it is still formulated in function spaces. To apply a minimization algorithm, we now reformulate it in terms of the coefficient vectors of the finite-dimensional functions. First, \(D_h^c\cap U\) can be identified with the set
through \(u_h=\sum _{x_i\in \overline{\omega _c}}\mathbf{u }_i\varphi _i\). Next we introduce the finite-dimensional subspace \({\mathcal {O}}_h=B(D_\vartheta )\) of \({\mathcal {O}}\) and the discrete control-to-observation operator on \(U_h\) by \(\mathbf{S }_\vartheta :{\mathbb {R}}^{N_c}\rightarrow {\mathbb {R}}^{N_o}\) with \(N_o=\dim ({\mathcal {O}}_h)\) defined by \(S_\vartheta \). Moreover we define the matrix \(M_{{\mathcal {O}}}\in {\mathbb {R}}^{N_o\times N_o}\) representing the mapping \((o_1,o_2)\mapsto (o_1,o_2)_{{\mathcal {O}}}\) for \(o_1,o_2\in {\mathcal {O}}_h\). Thus the inner product and the norm of \({\mathcal {O}}\) in \({\mathcal {O}}_h\) can be identified with \((\mathbf{o }_1,\mathbf{o }_2)_{{\mathcal {O}}_h}=\mathbf{o }_1^\top M_{{\mathcal {O}}}\mathbf{o }_2\) and \(\Vert \mathbf{o }\Vert _{{\mathcal {O}}_h}=((\mathbf{o },\mathbf{o })_{{\mathcal {O}}_h})^{1/2}\) for \(\mathbf{o },~\mathbf{o }_1,~\mathbf{o }_2\in {\mathbb {R}}^{N_o}\). We denote the orthogonal projection onto \({\mathcal {O}}_h\) by \(\pi _{{\mathcal {O}}}\). The operator B restricted to \(D_\vartheta \) can be identified with a matrix \(B_h\in {\mathbb {R}}^{N_o\times N_{\vartheta }}\) for \(N_{\vartheta }=\dim (D_\vartheta )\); thus \(B^*\) can be identified with \(B_h^\top \). With these identifications, the adjoint operator \(S_\vartheta '(u_h)^*\) for \(u_h\in U\cap D_h^c\) acting on \({\mathcal {O}}_h\) can similarly be identified with \(\mathbf{S }_\vartheta '(\mathbf{u })^*:{\mathbb {R}}^{N_o}\rightarrow {\mathbb {R}}^{M_c}\) for some \(\mathbf{u }\in U_h\). Moreover we define the matrix \(A_h\in {\mathbb {R}}^{dM_c\times N_c}\) representing the bilinear form \((\nabla u_h,\xi _h)_{L^2(\Omega )^d}\) for \(u_h\in D_h^c\) and \(\xi _h\in (C_h^c)^d\). Thus we have
Finally, the trapezoidal rule in the definition of \(G_h\) can be expressed in the form of a mass lumping scheme, i.e.,
where
is the scalar multibang penalty including the box constraints from [12] and \(d_i=\int _{\omega _c}\varphi _i\,\mathrm {d}x\) are the diagonal entries of the lumped mass matrix, see [46,47,48,49].
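Since the gradient of \(u_h\in D_h^c\) is constant on each triangle, the discrete total variation can be assembled triangle by triangle. A minimal sketch with hypothetical mesh arrays (not the FEniCS data structures used later), assuming \(\mathrm {TV}(u_h)=\sum _T |T|\,|\nabla u_h|_T|\) consistent with the construction of \(A_h\) above:

```python
import numpy as np

# Gradient of a P1 function is constant per triangle: solve E g = du with
# E the 2x2 matrix of edge vectors, then add area * |g| for each triangle.
def discrete_tv(nodes, triangles, u):
    tv = 0.0
    for (i, j, k) in triangles:
        E = np.array([nodes[j] - nodes[i], nodes[k] - nodes[i]])
        g = np.linalg.solve(E, np.array([u[j] - u[i], u[k] - u[i]]))
        tv += 0.5 * abs(np.linalg.det(E)) * np.linalg.norm(g)
    return tv

# Unit square split into two triangles, u_h(x1, x2) = x1: TV = |(1,0)| * 1 = 1
nodes = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
triangles = [(0, 1, 2), (0, 2, 3)]
tv = discrete_tv(nodes, triangles, np.array([0.0, 1.0, 1.0, 0.0]))
```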
Using these notations, we can write (33) equivalently in the form
Primal–Dual Proximal Splitting
To solve (35), we extend the approach in [31] by applying the nonlinear primal–dual proximal splitting method from [16,17,18] together with a lifting trick. For this purpose, we write (35) (omitting the bold notation and the subscripts denoting vectors and discretizations from now on and assuming that \(y_d\in {\mathcal {O}}_h\)) as
Setting
we can apply the nonlinear primal–dual proximal splitting algorithm
for step sizes \(\gamma _F,\gamma _G>0\) satisfying \(\gamma _F\gamma _G\Vert K'(u)^* \Vert <1\). Convergence can be guaranteed under a second-order-type condition for K and possibly further restrictions on the step sizes, whose (very technical) verification is outside the scope of this work. Instead, we restrict the discussion here to deriving the explicit form of (36) in the present setting.
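To illustrate the structure of the iteration (36), the following sketch applies it to a toy problem with a linear operator K (so that it reduces to the classical linear primal–dual method), \({\mathcal {F}}(z)=\tfrac{1}{2}\Vert z-d\Vert ^2\), and \({\mathcal {G}}=\lambda \Vert \cdot \Vert _1\); the data d and the weight \(\lambda \) are purely illustrative. For \(K=\mathrm {Id}\), the minimizer is the soft-thresholding of d, which the iteration reproduces:

```python
import numpy as np

# Toy instance of iteration (36): K = Id (linear, so K'(u)* = Id),
# F(z) = 0.5 * ||z - d||^2 with explicit prox of gamma_F * F*,
# G = lam * ||.||_1 with soft-thresholding prox.  The minimizer of
# 0.5 * ||u - d||^2 + lam * ||u||_1 is the soft-thresholding of d.
d = np.array([2.0, -0.5, 0.05, -3.0])       # illustrative data
lam = 0.3                                   # illustrative weight
gamma_F, gamma_G = 0.9, 0.9                 # gamma_F * gamma_G * ||K||^2 < 1

prox_Fstar = lambda p: (p - gamma_F * d) / (1.0 + gamma_F)
prox_G = lambda v: np.sign(v) * np.maximum(np.abs(v) - gamma_G * lam, 0.0)

u = np.zeros_like(d); u_bar = u.copy(); p = np.zeros_like(d)
for _ in range(2000):
    p = prox_Fstar(p + gamma_F * u_bar)     # dual step on the extrapolated point
    u_next = prox_G(u - gamma_G * p)        # primal step
    u_bar = 2.0 * u_next - u                # extrapolation
    u = u_next

u_exact = np.sign(d) * np.maximum(np.abs(d) - lam, 0.0)  # soft-thresholding of d
```

In the actual problem, K is the nonlinear map from the coefficient to the residual and TV variables, so the dual step uses \(K({{\bar{u}}}^k)\) and the primal step \(K'(u^k)^*\), as in (36).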
First, we endow \(({\mathbb {R}}^{N_o}\times {\mathbb {R}}^{2M_c})\) with the sum of the inner product induced by \(M_{\mathcal {O}}\) (for \({\mathbb {R}}^{N_o}\)) and the Euclidean inner product (for \({\mathbb {R}}^{2M_c}\)). With respect to this inner product, we obtain the adjoint Fréchet derivative
where \(S'(u)^*\) is the fully discrete operator corresponding to (32) with right-hand side \(r\in {\mathcal {O}}\) for the adjoint equation.
The proximal point mapping for the (scaled) multibang penalty can be obtained by straightforward calculation based on a case distinction in the definition of the subdifferential, see [31, Prop. 3.6]; for the sake of completeness, we give the short derivation here in full. By the definition of the proximal mapping, \(w=\mathrm {prox}_{\gamma g}(v) = ({{\,\mathrm{\mathrm {Id}}\,}}+ \gamma \partial g)^{-1}(v)\) holds for any \(v\in {\mathbb {R}}\) if and only if \(v \in \{w\} + \gamma \partial g(w)\). Recalling from [13, § 2] that
we now distinguish the following cases for w:

(i)
\(w=u_1\): In this case,
$$\begin{aligned} v \in \{w\} + \gamma \left( -\infty ,\tfrac{1}{2}(u_1+u_2)\right] = \left( -\infty ,(1+\tfrac{\gamma }{2}) u_1 + \tfrac{\gamma }{2} u_2\right] . \end{aligned}$$ 
(ii)
\(w\in (u_i,u_{i+1})\) for \(1\le i< m\): In this case,
$$\begin{aligned} v \in \{w\} + \gamma \{\tfrac{1}{2}(u_i + u_{i+1})\}, \end{aligned}$$which first can be solved for w to yield
$$\begin{aligned} w = v - \tfrac{\gamma }{2} (u_i+u_{i+1}); \end{aligned}$$inserting this into \(w\in (u_i,u_{i+1})\) and simplifying then gives
$$\begin{aligned} v \in \left( (1+\tfrac{\gamma }{2})u_i + \tfrac{\gamma }{2} u_{i+1}, \tfrac{\gamma }{2} u_i + (1+\tfrac{\gamma }{2})u_{i+1}\right) . \end{aligned}$$ 
(iii)
\(w = u_i\), \(1<i<m\): Proceeding as in the first case, we obtain
$$\begin{aligned} v \in \left[ \tfrac{\gamma }{2} u_{i-1}+(1+\tfrac{\gamma }{2})u_i,\ (1+\tfrac{\gamma }{2}) u_i + \tfrac{\gamma }{2} u_{i+1}\right] . \end{aligned}$$ 
(iv)
\(w=u_m\): Similarly, this implies that
$$\begin{aligned} v \in \left[ \tfrac{\gamma }{2} u_{m-1}+(1+\tfrac{\gamma }{2})u_m,\ \infty \right) . \end{aligned}$$
Since this is a complete and disjoint case distinction for \(v\in {\mathbb {R}}\), we obtain the proximal mapping for the scalar penalty g, see Fig. 1. By a standard argument, the proximal point mapping for \(G_h\) is thus given componentwise by
where we have set \(u_{0}=-\infty \) and \(u_{m+1}=\infty \) to avoid the need for further case distinctions. (Note that we compute the proximal mapping with respect to the inner product induced by the lumped mass matrix such that the weight \(d_i\) cancels.)
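The case distinction above translates directly into code. A minimal sketch of the scalar proximal mapping (zero-based indexing; the sentinels \(u_0=-\infty \) and \(u_{m+1}=\infty \) appear as the open boundary cases):

```python
import math

# Scalar prox of the multibang penalty following cases (i)-(iv); u is the
# sorted list of admissible values u_1 < ... < u_m (zero-based here).
def prox_g(v, u, gamma):
    m = len(u)
    # w = u_i: v lies in the closed interval around the admissible value u_i
    for i in range(m):
        left = -math.inf if i == 0 else gamma / 2 * u[i - 1] + (1 + gamma / 2) * u[i]
        right = math.inf if i == m - 1 else (1 + gamma / 2) * u[i] + gamma / 2 * u[i + 1]
        if left <= v <= right:
            return u[i]
    # w in (u_i, u_{i+1}): shrink v by gamma/2 * (u_i + u_{i+1})
    for i in range(m - 1):
        if (1 + gamma / 2) * u[i] + gamma / 2 * u[i + 1] < v < gamma / 2 * u[i] + (1 + gamma / 2) * u[i + 1]:
            return v - gamma / 2 * (u[i] + u[i + 1])
    raise AssertionError("the case distinction is exhaustive")
```

Values of v below the first or above the last interval are clipped to the box \([u_1,u_m]\), and intermediate values are either snapped to an admissible value or shrunk toward one, exactly as in the case distinction.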
Finally, for \({\mathcal {F}}\), we first compute the Fenchel conjugate on \({\mathbb {R}}^{N_o}\times {\mathbb {R}}^{2M_c}\) (with respect to the same inner product as above) as
to obtain the proximal point mapping (again, with respect to this inner product)
where the projection can be computed elementwise for each \(K\in {\mathcal {T}}\cap \omega _c\) as
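Assuming the displayed projection is the standard radial scaling onto the pointwise ball of radius \(\beta \) (the usual form for the conjugate of \(\beta \) times a norm), it can be sketched as:

```python
import numpy as np

# Hypothetical per-triangle dual variable psi in R^d; projection onto the
# Euclidean ball of radius beta (assumed form of the displayed formula).
def project_ball(psi, beta):
    norm = np.linalg.norm(psi)
    return psi if norm <= beta else (beta / norm) * psi

p = project_ball(np.array([3.0, 4.0]), 1.0)   # scaled to [0.6, 0.8]
q = project_ball(np.array([0.3, 0.4]), 1.0)   # inside the ball: unchanged
```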
With these, (36) becomes the following explicit algorithm:
Note that this requires two solutions of the forward wave equation (as well as one solution of the adjoint equation) in each iteration, since \(y^{k+1}\) is based on the extrapolated vector \({{\bar{u}}}^{k+1}\), while the state vector y required for the computation of \(S'(u^{k+1})^*r^{k+1}\) in the following iteration is based on the original update \(u^{k+1}\).
The iteration is terminated based on the residual norm in an equivalent reformulation of the optimality conditions for (33). Combining the approach of Sect. 4 with standard results from convex analysis (see, e.g., [50, 51]), any local minimizer \({{\bar{u}}}\) of (33) together with the corresponding Lagrange multiplier \({\bar{\psi }}\) and the residual \({{\bar{r}}} := S({{\bar{u}}})-y_d\) can be shown to satisfy
For the first equation, which holds in \(U_h\), we measure the residual in the discrete norm induced by the lumped mass matrix as in the definition of \(G_h\). The second equation holds in \({\mathcal {O}}_h\), and hence we measure the residual in the norm induced by the corresponding mass matrix \(M_{\mathcal {O}}\). Finally, the last equation holds in \({\mathbb {R}}^{2N_c}\), so we use the standard Euclidean norm. The iteration is terminated once the sum of these residuals drops below a given tolerance. For the implementation, note that the residual in the first equation for \((u^k,r^k,\psi ^k)\) reduces to \(u^{k}-u^{k+1}\). On the other hand, the residual in the second equation requires an additional solution of the state equation, since here S is applied to \(u^k\) instead of the extrapolated \({{\bar{u}}}^k\). In practice, we thus do not evaluate the stopping criterion in every iteration.
Numerical Examples
We now illustrate the approach presented above with two numerical examples. The first is a transmission problem (where waves produced by external forcing pass through the control domain before being observed) loosely motivated by acoustic tomography. The second is a reflection problem (where only reflected, not transmitted, waves are observed) that more closely models seismic inversion. The implementation in Python (using DOLFIN [52, 53], which is part of the open-source computing platform FEniCS [54, 55]) used to generate the following results can be downloaded from https://github.com/clason/tvwavecontrol.
A Model Acoustic Tomography Problem
For the first example, we take \(\Omega = (-1,1)\times (-1,2)\) and \(T=3\) and define the control and observation domains
correspondingly, the observation operator is taken as the restriction operator \(By := y|_{{\mathcal {O}}}\) to the observation space \({\mathcal {O}}:= (0,T)\times \omega _o\). The initial conditions are chosen as \((y_0,y_1) = (0,0)\), thus satisfying Assumption 3.1. We now aim to recover from noisy observations of the state in \({\mathcal {O}}\) a piecewise constant coefficient \(u_e\) with \(u_e(x) \in \{1.0,1.1,1.2,1.3,1.4\}\) almost everywhere; see Fig. 2. Accordingly, we set \({\hat{u}} \equiv 1.0\) and \(u_i = (i-1)/10\), \(i=1,\dots ,5\). These observations are generated using a source term f that is constructed as a linear combination of point sources which act as Ricker wavelets in time, i.e.,
(The number and location of the source points as well as the amplitude and frequency of the wavelet are chosen such as to obtain a sufficiently complex wave pattern to recover the lateral and depthwise variations in the coefficient.) The discretization is performed using 64 nodes in each space direction and 128 nodes in time, corresponding to \(h\approx 0.056\) and \(\tau \approx 0.023\). The stabilization constant is set to \(\sigma = 1/4\). The discretized exact solution is then perturbed componentwise by \(10\%\) relative Gaussian noise, i.e., we take
where \(\xi \) is a vector of independent normally distributed random components with mean 0 and variance 1.
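One common reading of componentwise relative Gaussian noise — the precise formula is the displayed one above, so the multiplicative form below is only an assumption for illustration — is:

```python
import numpy as np

# Assumed multiplicative form: y_d = y_e * (1 + 0.1 * xi), xi ~ N(0,1) i.i.d.
rng = np.random.default_rng(0)                 # seeded for reproducibility
y_e = np.sin(np.linspace(0.0, 3.0, 200))       # stand-in for the exact data
xi = rng.standard_normal(y_e.shape)
y_d = y_e * (1.0 + 0.1 * xi)
```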
We now compute the reconstruction using the algorithm described in Sect. 5.2, comparing the effects of the total variation and the multibang penalty by taking \(\alpha \in \{0,10^{-5}\}\) and \(\beta \in \{0,10^{-4}\}\). In each case, we set the step sizes to \(\gamma _F = 10^{-1}\) and \(\gamma _G = 10^3\) and terminate when the residual norms (evaluated every 10 iterations) drop below \(10^{-6}\). Again, these parameters are chosen to achieve a reasonable reconstruction in as few iterations as possible. (A proper parameter choice rule depending on the noise level and the discretization is left for future work.) The results can be seen in Fig. 3. The case of pure multibang regularization (\(\alpha = 10^{-5}\) and \(\beta = 0\), 3680 iterations; see Fig. 3a) shows that indeed \(u(x)\in \{u_1,\dots ,u_5\}\) almost everywhere; however, there is a clear lack of regularity in the reconstruction, which is not surprising as the original function-space problem is not well-posed for \(\beta =0\). In contrast, the case of pure \(\mathrm {TV}\) regularization (\(\alpha = 0\) and \(\beta = 10^{-4}\), 1100 iterations; see Fig. 3b) shows a much more regular reconstruction that is constant on large regions; however, these constants are not necessarily from the admissible set \(\{u_1,\dots ,u_5\}\). Finally, combining both multibang and total variation regularization (\(\alpha = 10^{-5}\) and \(\beta = 10^{-4}\), 600 iterations; see Fig. 3c) allows recovering more admissible values at the price of penalizing the magnitude of the coefficient value, which prevents the largest value \(u_5=0.4\) from being attained. It is also noteworthy that in this case the tolerance for the residual norm is reached after significantly fewer iterations.
To illustrate the effect of a variation of the desired values \(u_i\) on the reconstruction, we recompute the last example with the same parameters \(\alpha ,\beta \) but with the desired values increased by 10%, i.e., \(u_i=1.1\,(i-1)/10\), \(i=1,\dots ,5\). The results are shown in Fig. 4, where we repeat the exact coefficient from Fig. 2 with adjusted labels in Fig. 4a for better comparison. As can be seen from Fig. 4b, the reconstruction is similar to that for \(\alpha =0\). In particular, the total variation penalty prevents the misspecified desired values from being enforced strongly. This demonstrates that while misspecified values clearly do not have the same positive influence on the reconstruction, they at least do not have a negative influence.
A Model Seismic Inverse Problem
We next consider an example inspired by seismic tomography. We assume that the data is given in the form of a time series of mean values of the reflected waves y over certain spatial regions \(O_i\). Thus we define the observation space \({\mathcal {O}}=L^2(I)^m\) for \(m\in {\mathbb {N}}\) and the observation operator
where the \(O_i\subset \Omega \) are the m spatial observation patches. Furthermore we assume that seismic sources are given by s point sources located on the surface \(\Gamma _s\subset \partial \Omega \) whose magnitudes are time dependent and follow a Ricker wavelet of the form
with \(h,a,t\in {\mathbb {R}}^s\). This leads to the modified state equation
with \((x_k)_{k=1}^s\subset \Gamma _s\), \((f_k)_{k=1}^s\subset L^2(I)\), and \(\delta _{x_k}\) the Dirac measure supported on \(x_k\). In our concrete example, we choose \(\Omega =(-1,1)^2\), \(\Gamma _s=(-1,1)\times \{1\}\), and \(T=3\). We set \(\omega _c=(-1,1)\times (-1,0.7)\). The observation patches are chosen as
The sources are located at \(x=(-1+k\cdot 0.1,1)\) with \(k=0,\ldots ,20\). The parameters of the Ricker wavelet are set to \(a_k=2\), \(h_k=5\), and \(t_k=0.1\). The offset \({\hat{u}}\) has the constant value 1. Finally, the exact velocity model is given by
for the constant reference coefficient \({\hat{u}} \equiv 1\), cf. Fig. 5a.
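For illustration, a common parameterization of a Ricker wavelet with amplitude \(a_k\), frequency parameter \(h_k\), and delay \(t_k\) as above can be sketched as follows; the exact formula used in the experiments is the displayed one, so this parameterization is an assumption:

```python
import math

# Hypothetical Ricker parameterization: amplitude a, frequency parameter h_
# (named h_ to avoid clashing with the mesh size h), delay t0.
def ricker(t, a=2.0, h_=5.0, t0=0.1):
    s = math.pi * h_ * (t - t0)
    return a * (1.0 - 2.0 * s**2) * math.exp(-s**2)
```

In this form, the wavelet peaks with value \(a_k\) at \(t=t_k\) and has its zero crossings at \(t_k\pm 1/(\pi h_k\sqrt{2})\).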
The recorded data for our experiments are generated by solving the state equation with the exact velocity model \(u_e\) resulting in the exact state \(y_e\). Then we set \(y_d=By_e+\delta n\) with \(\delta \in [0,1]\). The function \(n\in L^\infty (I)^m\) is a disturbance which models measurement errors and exterior influences. In our case we use a function of the form
where \(M\in {\mathbb {N}}\), \(\eta _k=\frac{\Vert (By_e)_k\Vert _{L^\infty (I)}}{\Vert r_k\Vert _{L^\infty (I)}}\), and \(m_{i,k},s_{i,k}\) are uniform random numbers in [0, 1]. Here we take \(M=10\).
For the discretization, we take a tensor-product-based triangular mesh with \(N_h=129^2\), \(N_\tau =129\), and \(\sigma =1/4\). The relative noise level is \(\delta =0.05\). An appropriate regularization parameter is given by \(\beta = 10^{-4}\); for simplicity, we set \(\alpha =0\). The iteration is initialized with \(u_0=0\), and the step sizes are again chosen as \(\gamma _F=10^{-1}\) and \(\gamma _G=10^3\). The iteration is stopped once the absolute residual is smaller than \(10^{-4}\); in this experiment, this was reached after 1068 iterations. Figure 6 shows the exact and noisy observations on \(O_1\), \(O_5\), and \(O_{10}\). At the onset, we note two high spikes (a negative and a positive one), which are caused by the source wave initiated at boundary points on \(\Gamma _s\). The remaining oscillations are caused by the reflected waves originating from the discontinuities of \(u_e\) and from the reflecting boundary; only these carry information about the coefficient, which makes the reconstruction challenging. The results are shown in Fig. 5b, where each color map is scaled individually to show more detail. We observe that the positions of the discontinuities in \(u_e\) that are close to the observation patches are well approximated in \({{\bar{u}}}\) and that the corresponding interfaces are quite sharp. However, the approximation quality of the discontinuities becomes worse farther away from the observation region. This is caused by the fact that reflected waves from lower sections of the discontinuities are more dispersed than those from the upper sections.
Conclusion
We showed existence of solutions to an optimal control problem for the wave equation with the control entering into the principal part of the operator using total variation regularization and a reformulation of pointwise constraints using a cutoff function. Preferential attainment of a discrete set of control values is incorporated through a multibang penalty. We also derived an improved regularity result for solutions of the wave equation under additional natural assumptions on the data and the control, which (for smooth cutoff functions) allows obtaining necessary optimality conditions that can be interpreted in a suitable pointwise fashion. Finally, we demonstrated that the optimal control problem can be solved numerically using a combination of a stabilized finite element discretization and a nonlinear primal–dual proximal splitting algorithm.
This work can be extended in several directions. Besides applying the proposed approach to more realistic models of acoustic tomography or seismic imaging for practical applications, it would be worthwhile to consider the case of boundary observations of the state [56], which however may lead to an unbounded observation operator B. A further challenging goal would be deriving sufficient second-order conditions. Such conditions could then be used for obtaining discretization error estimates for the optimal controls or for showing convergence of the nonlinear primal–dual proximal splitting algorithm based on the “three-point condition” on S from [18].
References
 1.
Beilina, L., Clason, C.: An adaptive hybrid FEM/FDM method for an inverse scattering problem in scanning acoustic microscopy. SIAM J. Sci. Comput. 28(1), 382–402 (2006). https://doi.org/10.1137/050631252
 2.
Krautkrämer, J., Krautkrämer, H.: Ultrasonic Testing of Materials, 4th edn. Springer-Verlag, Berlin, Heidelberg (1990). https://doi.org/10.1007/978-3-662-10680-8
 3.
Tarantola, A.: Inversion of seismic reflection data in the acoustic approximation. Geophysics 49(8), 1259–1266 (1984). https://doi.org/10.1190/1.1441754
 4.
Murat, F.: Contre-exemples pour divers problèmes où le contrôle intervient dans les coefficients. Ann. Mat. Pura Appl. 4(112), 49–68 (1977). https://doi.org/10.1007/BF02413475
 5.
Jiang, J.S., Kuo, K.H., Lin, C.K.: On the homogenization of second order differential equations. Taiwan. J. Math. 9(2), 215–236 (2005). https://doi.org/10.11650/twjm/1500407797
 6.
Murat, F., Tartar, L.: \(H\)-convergence. Topics in the mathematical modelling of composite materials. Progr. Nonlinear Differ. Equ. Appl. 31, 21–43 (1997). https://doi.org/10.1007/978-1-4612-2032-9_3
 7.
Tartar, L.: The appearance of oscillations in optimization problems. In: Nonclassical Continuum Mechanics (Durham, 1986), London Math. Soc. Lecture Note Ser., vol. 122, pp. 129–150. Cambridge Univ. Press, Cambridge (1987). https://doi.org/10.1017/CBO9780511662911.008
 8.
Tartar, L.: Homogenization and hyperbolicity. Annali della Scuola Normale Superiore di Pisa – Classe di Scienze, Ser. 4, 25(3–4), 785–805 (1997). http://www.numdam.org/item/ASNSP_1997_4_25_34_785_0
 9.
Tartar, L.: The General Theory of Homogenization. Lecture Notes of the Unione Matematica Italiana, vol. 7. Springer, UMI, Berlin, Bologna (2009)
 10.
Clason, C., Kruse, F., Kunisch, K.: Total variation regularization of multimaterial topology optimization. ESAIM 52(1), 275–303 (2018). https://doi.org/10.1051/m2an/2017061
 11.
Clason, C., Kunisch, K.: Multibang control of elliptic systems. Ann. Institut Henri Poincaré (C) Anal. Non Linéaire 31(6), 1109–1130 (2014). https://doi.org/10.1016/j.anihpc.2013.08.005
 12.
Clason, C., Kunisch, K.: A convex analysis approach to multimaterial topology optimization. ESAIM 50(6), 1917–1936 (2016). https://doi.org/10.1051/m2an/2016012
 13.
Clason, C., Do, T.B.T.: Convex regularization of discrete-valued inverse problems. In: Hofmann, B., Leitão, A., Zubelli, J. (eds.) New Trends in Parameter Identification for Mathematical Models, Trends in Mathematics, pp. 31–51. Springer, Berlin (2018). https://doi.org/10.1007/978-3-319-70824-9_2
 14.
Gröger, K.: A \(W^{1, p}\)estimate for solutions to mixed boundary value problems for second order elliptic differential equations. Math. Ann. 283(4), 679–687 (1989). https://doi.org/10.1007/BF01442860
 15.
Zlotnik, A.A.: Convergence rate estimates of finite-element methods for second-order hyperbolic equations. In: Marchuk, G.I. (ed.) Numerical Methods and Applications, pp. 155–220. CRC, Boca Raton, FL (1994)
 16.
Valkonen, T.: A primal-dual hybrid gradient method for nonlinear operators with applications to MRI. Inverse Probl. 30(5), 055012 (2014). https://doi.org/10.1088/0266-5611/30/5/055012
 17.
Clason, C., Valkonen, T.: Primaldual extragradient methods for nonlinear nonsmooth PDEconstrained optimization. SIAM J. Optim. 27(3), 1313–1339 (2017). https://doi.org/10.1137/16M1080859
 18.
Clason, C., Mazurenko, S., Valkonen, T.: Acceleration and global convergence of a firstorder primaldual method for nonconvex problems. SIAM J. Optim. 29(1), 933–963 (2019). https://doi.org/10.1137/18M1170194
 19.
Bube, K.P.: Convergence of numerical inversion methods for discontinuous impedance profiles. SIAM J. Numer. Anal. 22(5), 924–946 (1985). https://doi.org/10.1137/0722056
 20.
Lavrent’ev Jr., M.M.: An inverse problem for the wave equation with a piecewiseconstant coefficient. Sibirsk. Mat. Zh. 33(3), 101–111 (1992). https://doi.org/10.1007/BF00970893. 219
 21.
Aktosun, T., Klaus, M., van der Mee, C.: Integral equation methods for the inverse problem with discontinuous wave speed. J. Math. Phys. 37(7), 3218–3245 (1996). https://doi.org/10.1063/1.531565
 22.
Sedipkov, A.A.: A direct and an inverse problem of acoustic sounding in a stratified medium with discontinuous parameters. Sib. Zh. Ind. Mat. 17(1), 120–134 (2014)
 23.
Stolk, C.C.: On the modeling and inversion of seismic data. Ph.D. thesis, Universiteit Utrecht (2000). https://dspace.library.uu.nl/handle/1874/855
24. Böhm, C.: Efficient inversion methods for constrained parameter identification in full-waveform seismic tomography. Dissertation, Technische Universität München, München (2015). http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:91diss20150227123204017
25. Goncharsky, A.V., Romanov, S.Y.: A method of solving the coefficient inverse problems of wave tomography. Comput. Math. Appl. 77(4), 967–980 (2019). https://doi.org/10.1016/j.camwa.2018.10.033
26. Epanomeritakis, I., Akçelik, V., Ghattas, O., Bielak, J.: A Newton-CG method for large-scale three-dimensional elastic full-waveform seismic inversion. Inverse Probl. 24(3), 034015 (2008). https://doi.org/10.1088/0266-5611/24/3/034015
27. Burstedde, C., Ghattas, O.: Algorithmic strategies for full waveform inversion: 1-D experiments. Geophysics 74(6), WCC37–WCC46 (2009). https://doi.org/10.1190/1.3237116
28. Esser, E., Guasch, L., van Leeuwen, T., Aravkin, A., Herrmann, F.: Total variation regularization strategies in full-waveform inversion. SIAM J. Imaging Sci. 11(1), 376–406 (2018). https://doi.org/10.1137/17M111328X
29. Yong, P., Liao, W., Huang, J., Li, Z.: Total variation regularization for seismic waveform inversion using an adaptive primal-dual hybrid gradient method. Inverse Probl. 34(4), 045006 (2018). https://doi.org/10.1088/1361-6420/aaaf8e
30. Gao, K., Huang, L.: Acoustic- and elastic-waveform inversion with total generalized p-variation regularization. Geophys. J. Int. 218(2), 933–957 (2019). https://doi.org/10.1093/gji/ggz203
31. Do, T.B.T.: Discrete regularization for parameter identification problems. Ph.D. thesis, Faculty of Mathematics, University of Duisburg-Essen (2019). https://doi.org/10.17185/duepublico/70265
32. Hante, F.M., Sager, S.: Relaxation methods for mixed-integer optimal control of partial differential equations. Comput. Optim. Appl. 55(1), 197–225 (2013). https://doi.org/10.1007/s10589-012-9518-3
33. Ambrosio, L., Fusco, N., Pallara, D.: Functions of Bounded Variation and Free Discontinuity Problems. Oxford Mathematical Monographs. The Clarendon Press, Oxford University Press, New York (2000). https://doi.org/10.1007/978-3-0348-8974-2_2
34. Giusti, E.: Minimal Surfaces and Functions of Bounded Variation. Monographs in Mathematics, vol. 80. Birkhäuser Verlag, Basel (1984). https://doi.org/10.1007/978-1-4684-9486-0
35. Ziemer, W.P.: Weakly Differentiable Functions. Graduate Texts in Mathematics, vol. 120. Springer, New York (1989). https://doi.org/10.1007/978-1-4612-1015-3
36. Barbu, V., Precupanu, T.: Convexity and Optimization in Banach Spaces, 4th edn. Springer Monographs in Mathematics. Springer, Dordrecht (2012). https://doi.org/10.1007/978-94-007-2247-7
37. Lions, J.L., Magenes, E.: Non-homogeneous Boundary Value Problems and Applications, vol. I. Springer-Verlag, New York–Heidelberg (1972). https://doi.org/10.1007/978-3-642-65161-8
38. Brezis, H.: Functional Analysis. Sobolev Spaces and Partial Differential Equations. Springer, New York (2010). https://doi.org/10.1007/978-0-387-70914-7
39. Grigor’yan, A.: Heat Kernel and Analysis on Manifolds. AMS/IP Studies in Advanced Mathematics, vol. 47. American Mathematical Society, Providence, RI; International Press, Boston, MA (2009). https://doi.org/10.1090/amsip/047
40. Wloka, J.: Partial Differential Equations. Cambridge University Press, Cambridge (1987). https://doi.org/10.1017/CBO9781139171755. Translated from the German by C. B. Thomas and M. J. Thomas
41. Edwards, R.E.: Functional Analysis. Theory and Applications. Holt, Rinehart and Winston, New York–Toronto–London (1965)
42. Grisvard, P.: Elliptic Problems in Nonsmooth Domains. SIAM, Philadelphia, PA (2011). https://doi.org/10.1137/1.9781611972030. Reprint of the 1985 hardback edition
43. Yosida, K.: Functional Analysis. Grundlehren der Mathematischen Wissenschaften, vol. 123, 6th edn. Springer, Berlin (1980). https://doi.org/10.1007/978-3-662-25762-3
44. Bredies, K., Holler, M.: A pointwise characterization of the subdifferential of the total variation functional (2012). MOBIS SFB-Report 2012-011
45. Chambolle, A., Goldman, M., Novaga, M.: Fine properties of the subdifferential for a class of one-homogeneous functionals. Adv. Calc. Var. 8(1), 31–42 (2015). https://doi.org/10.1515/acv-2012-0025
46. Casas, E., Herzog, R., Wachsmuth, G.: Approximation of sparse controls in semilinear equations by piecewise linear functions. Numer. Math. 122(4), 645–669 (2012). https://doi.org/10.1007/s00211-012-0475-7
47. Pieper, K.: Finite element discretization and efficient numerical solution of elliptic and parabolic sparse control problems. Dissertation, Technische Universität München, München (2015). http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:91diss20150420124141314
48. Trautmann, C.P.: Sparse measure-valued optimal control problems governed by wave equations. Dissertation, Karl-Franzens-Universität Graz, Graz (2015). http://resolver.obvsg.at/urn:nbn:at:atubg:188846
49. Rösch, A., Wachsmuth, G.: Mass lumping for the optimal control of elliptic partial differential equations. SIAM J. Numer. Anal. 55(3), 1412–1436 (2017). https://doi.org/10.1137/16M1074473
50. Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces. CMS Books in Mathematics/Ouvrages de Mathématiques de la SMC. Springer, New York (2011). https://doi.org/10.1007/978-1-4419-9467-7
51. Clason, C.: Nonsmooth Analysis and Optimization (2017). Lecture notes
52. Logg, A., Wells, G.N.: DOLFIN: Automated finite element computing. ACM Trans. Math. Softw. 37(2), 1–28 (2010). https://doi.org/10.1145/1731022.1731030
53. Logg, A., Wells, G.N., Hake, J.: DOLFIN: a C++/Python finite element library. In: Logg, A., Mardal, K.A., Wells, G.N. (eds.) Automated Solution of Differential Equations by the Finite Element Method. Springer, Berlin (2012). https://doi.org/10.1007/978-3-642-23099-8_10
54. Alnæs, M.S., Blechta, J., Hake, J., Johansson, A., Kehlet, B., Logg, A., Richardson, C., Ring, J., Rognes, M.E., Wells, G.N.: The FEniCS project version 1.5. Arch. Numer. Softw. 3(100), 9–23 (2015). https://doi.org/10.11588/ans.2015.100.20553
55. Logg, A., Mardal, K.A., Wells, G.N. (eds.): Automated Solution of Differential Equations by the Finite Element Method. Lecture Notes in Computational Science and Engineering, vol. 84. Springer (2012). https://doi.org/10.1007/978-3-642-23099-8
56. Feng, X., Lenhart, S., Protopopescu, V., Rachele, L., Sutton, B.: Identification problem for the wave equation with Neumann data input and Dirichlet data observations. Nonlinear Anal. 52(7), 1777–1795 (2003). https://doi.org/10.1016/S0362-546X(02)00295-X
Acknowledgements
Support by the German Research Foundation (DFG) under grant CL 487/1-1 for C.C. and by the ERC advanced grant 668998 (OCLOC) under the EU’s H2020 research program for K.K. and P.T. is gratefully acknowledged.
Funding
Open Access funding enabled and organized by Projekt DEAL.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Clason, C., Kunisch, K. & Trautmann, P. Optimal Control of the Principal Coefficient in a Scalar Wave Equation. Appl Math Optim (2020). https://doi.org/10.1007/s00245-020-09733-9
Keywords
 Optimal control
 Wave equation
 Total variation regularization
 Primal-dual splitting
Mathematics Subject Classification
 49A22
 49K20
 49M29