Optimal control of the principal coefficient in a scalar wave equation

We consider optimal control of the scalar wave equation where the control enters as a coefficient in the principal part. Adding a total variation penalty allows showing existence of optimal controls, which requires continuity results for the coefficient-to-solution mapping for discontinuous coefficients. We additionally consider a so-called"multi-bang"penalty that promotes controls taking on values pointwise almost everywhere from a specified discrete set. Under additional assumptions on the data, we derive an improved regularity result for the state, leading to optimality conditions that can be interpreted in an appropriate pointwise fashion. The numerical solution makes use of a stabilized finite element method and a nonlinear primal-dual proximal splitting algorithm.

This work is concerned with an optimal control problem for the scalar wave equation where the control enters as the spatially varying coe cient in the principal part. Informally, we consider the problem where is a given (desired or observed) state, is a bounded linear observation operator mapping to the observation space O, R is a regularization term, < < are constants, and , , and (as well as boundary conditions) are given suitably. A precise statement is deferred to Section . Such problems occur, e.g., in acoustic tomography for medical imaging [ ] and non-destructive testing [ ] as well as in seismic inversion [ ]. In the latter, the goal is the determination of a "velocity model" (as described by the coe cient ) of the underground in a region of interest from recordings ("seismograms", modeled by ) of re ected pressure waves generated by sources on or near the surface (entering the equation via , , , or inhomogeneous boundary conditions). If the region contains multiple di erent materials like rock, oil, and gas, the velocity model changes rapidly or may even have jumps between material interfaces.
In the stationary case, the question of existence of solutions to problem ( . ) under only pointwise constraints and regularization has received a tremendous amount of attention. However, it was answered in the negative in [ ]; this and subsequent investigations led to the concept of -convergence : .
v , --page of and, more generally, to homogenization theory; see, e.g., [ , , , , ]. The use of regularization terms or constraints involving higher-order di erential operators would certainly guarantee existence but contradicts the goal of allowing piecewise continuous controls . Such considerations suggest the introduction of total variation regularization in addition to pointwise constraints. In this case, existence can be argued. However, this leads to di culties in deriving necessary optimality conditions since the sum rule of convex analysis can only be applied in the ∞ (Ω) topology, which would lead to (generalized) derivatives that do not admit a pointwise representation. This di culty can be circumvented by replacing the pointwise constraints by a (di erentiable approximation of a) cuto function applied to the coe cient in the equation and by using improved regularity results for the optimal state that allow extending the Fréchet derivative of the tracking term from ∞ (Ω) to (Ω) for < ∞ su ciently large. Together, this allows obtaining derivatives and subgradients in (Ω) for some > , which can be characterized in a suitable pointwise manner. This was carried out in [ ], which considered for R a combination of total variation and multi-bang regularization; the latter is a convex pointwise penalty that promotes controls which take values from a prescribed discrete set (e.g., corresponding to di erent materials such as rock, oil, and gas); see also [ , , ].
In the current work, we extend this approach to optimal control and identi cation of discontinuous coe cients in scalar wave equations by deriving under additional (natural) assumptions on the data the adapted higher regularity results for the wave equation based on elliptic maximal regularity theory [ ]; see Assumption and Proposition . below. We also address a suitable discretization of the problem using a stabilized nite element method [ ] and its solution by a nonlinear primal-dual proximal splitting method [ , , ].
Let us brie y comment on related literature. As there is a vast body of work on control and inverse problems for the wave equation, we focus here speci cally on the identi cation of discontinuous (and, in particular, piecewise constant) coe cients. This problem has attracted strong interest over the last few decades, mainly due to its relevance in seismic inversion. Classical works are mainly concerned with the one-dimensional setting -as a model for seismic inversion in strati ed or layered mediawhich allows making use of integral transforms to derive explicit "layer-stripping" formulas; see, e.g., [ , , , ]. Regarding the numerous works on wave speed identi cation in the multidimensional wave equation for seismic inversion, we only mention exemplarily [ , , ]; see also further literature cited there. The use of total variation penalties for recovering a piecewise constant wave speed in multiple dimensions has been proposed in, e.g., [ , , , , ], although the earlier works employed a smooth approximation of the total variation to allow the numerical solution by standard approaches for nonlinear PDE-constrained optimization. Finally, joint multi-bang and total variation regularization of linear inverse problems and its numerical solution by a primal-dual proximal splitting methods were considered in [ ]. We also mention that multi-bang control is related to (but di erent from) switching controls, where at each instant in time, one and only one from a given set of time-dependent controls should be active; see, e.g., [ ].
This work is organized as follows. In the next Section , we give a formal statement of the optimal control problem ( . ) and recall the relevant de nitions and properties of the functional. We then derive in Section the results on regularity, stability, and a priori estimates for solutions of the state equation that will be needed in the rest of the paper. In particular, in Proposition . we show a Groeger-type maximal regularity result for the wave equation under additional assumptions on the data. Section is devoted to existence and rst-order necessary optimality conditions for optimal controls, where we use the mentioned maximal regularity result to show that the latter can be interpreted in a pointwise fashion. We then discuss the numerical computation of solutions using a stabilized nite element discretization (see Section . ) together with a nonlinear primal-dual proximal splitting method (see Section . ). This approach is illustrated in Section for two examples: a transmission setup motivated by acoustic tomography and a re ection setup modeling seismic tomography.
v , --page of Let Ω ⊂ ℝ , ∈ { , }, be a bounded domain with , regular boundary Ω and outer normal . For brevity, we introduce the notation := (Ω) and := (Ω) and set := ( , ). Then we consider for ∈ ( , ), ∈ , and ∈ the weak solution ∈ ( , ) ∩ ( , ) to This choice of Neumann boundary conditions corresponds, e.g., for acoustic waves to the situation of re ection at a sound-hard obstacle and for elastic waves to the absence of external forces at the boundary (which is a natural setting for seismic imaging via interior sources). We will discuss existence and regularity of solutions to ( . ) in the following Section . The salient point is of course the coe cient in the principal part, which we want to control on an open subset ⊆ Ω, which is assumed to have a , regular boundary. For constants , with < < < ∞ we de ne the set of admissible coe cients and pick a reference coe cientˆ∈ˆ. To map a control de ned on to a coe cient de ned on Ω, we introduce the a ne bounded extension operator The set of controls that can be extended to admissible coe cients is then given by where min < max are such that ≤ inf ∈ˆ( ) + min ≤ sup ∈ˆ( ) + max ≤ . In particular, for ≡ , we have min = and max = − . Moreover, we introduce the observation space O which is assumed to be a separable Hilbert space as well as a linear and bounded observation operator ∈ ( ( ), O) with adjoint * ∈ (O, ( )).
We then consider the optimal control problem ( . ) min where ( ) is a weak solution to ( . ), is the multi-bang penalty from [ , ], TV denotes the total variation, and and are positive constants. In the remainder of this section, we recall the de nitions and properties of the total variation and the multi-bang penalty relevant to the current work.
Multi-bang penalty Let min ≤ < · · · < ≤ max be a given set of desired coe cient values. The multi-bang penalty is then de ned similar to [ ], where we have to replace the box constraints ( ) ∈ [ , ] by a linear growth to ensure that is nite on ( ), < ∞. For simplicity, we assume in the following that = min = and = max = − (i.e.,ˆ= ) and de ne where : ℝ → ℝ is given by This de nition can be motivated via the convex envelope of [ , ] ( ) + | | (where denotes the indicator function in the sense of convex analysis), see [ ]; note however that here (as in [ ]) is de ned to be nite for every ∈ ℝ, while the convex envelope is only nite for ∈ [ , ]. We also remark that for = , this reduces in the current setting to the well-known sparsity penalty (i.e., ( ) = ( ) for any ∈ ). It can be veri ed easily that is continuous, convex, and linearly bounded from above and below, i.e., | | ≤ ( ) ≤ | | for all ∈ ℝ.
Proof. Except for the estimate on ( , * ) , the claim follows from [ , Theorem . . , page ], where we observe that due to our assumption on ∈ˆ, the energy is coercive with respect to the seminorm in ; see also [ , Theorem . . ]. The constant depends on and , but is otherwise independent of ∈ˆ.
Using this result, we can apply an Aubin-Nitsche trick or duality argument to show Lipschitz continuity of ↦ → ( ) ∈ ( , ), which we will need to show di erentiability of the tracking term later. Lemma . . There exists a constant > such that the mapping ↦ → ( ) satis es for all , ∈ˆ.
In stronger norms, we only have the following weak continuity result, which will be used repeatedly.
From ( . ), we in particular have that for arbitrary ∈ ∩ ( , (Ω)) with ( ) = . Since and thus for all ∈ [ , ∞) Then we can pass to the limit in the weak formulation to obtain for all such . Since ∈ˆand since the set of functions with ∈ ∩ ( , (Ω)) and ( ) = is dense in { ∈ : ( ) = }, the last equation also holds for all ∈ with ( ) = . The density can be shown by adapting the density argument of ∞ (Ω) in ; see, e.g., [ , Cor. . ].
Under additional assumptions, we can show an improved regularity result. Assumption . The data satisfy ∈ ( ; ) and ( , ) ∈ (Ω) × (Ω) with = . Furthermore, The following result will be used to show Fréchet di erentiability of the tracking term in Lemma . below. Proposition . . Let ∈ˆand Assumption hold. Then there exists > and a constantˆindependent of such that ( ) ∈ ∞ ( , , (Ω)) and for all ∈ˆ, Proof. We proceed in two steps.
Step . We relax the requirements on the problem data and choose an arbitrary ( , , As this is standard for the second and third component, we only address the rst one. Let := Ω \ . By assumption, Ω ∩ = ∅; in addition, Ω and are , domains, and thus is a , domain as well. It is in this step that the , regularity of the domains is used. Since Denoting by ,ext and ext the extensions of and by the constant on , we have ,ext → ext ∈ (Ω), : . v , --page of ,ext | Ω = , ,ext | = ext | = , and ,ext ∈ (Ω), where we use that | = . Next, observe that − ∈ ( ). This implies the existence of functions ∈ ∞ ( ) with compact support in such that → − in ( ); see, e.g., [ , pp. and ]. Denoting the extension by zero to of by ,ext , we have ,ext ∈ (Ω), ,ext → − ext in (Ω), and ,ext = | Ω . Finally, the sequence = ,ext + ,ext de nes the desired approximation of such that | = , | Ω = , and → in (Ω).
Remark . . If Assumption holds, the requirement in Lemma . on the convergence of can be relaxed to → in − (Ω), where is the exponent from Proposition . . In this case, the rst term on the right-hand side in ( . ) can be estimated by Hölder's inequality as where we used Proposition . . Then again ( ) → ( ) in .
Remark . . In Assumption (ii), the requirement ⊂ Ω was only used in Step of the proof of Proposition . . It it is not necessary if instead ∈ (Ω) is assumed.
Deriving useful optimality conditions requires replacing the pointwise control constraints with a di erentiable approximation of a cuto function. We thus introduce the superposition operator and ≥ and Fréchet di erentiable from ∞ ( ) → ∞ ( ) for > (and thus ensuring Fréchet di erentiability of the tracking term; see Lemma . below). The construction of such a and the characterization of the Fréchet derivative of Φ via pointwise a.e. multiplication can be carried out in the same way as in [ , § . ].
We then consider for ≥ the reduced, unconstrained optimization problem for some ∈ O and , > , where ↦ → ( ) denotes the solution mapping of ( . ) introduced in the previous section andˆis the a ne extension operator from to Ω de ned in ( . ). We point out that the role of is not that of a smoothing parameter for the optimization problem, which remains nonsmooth for > due to the presence of and TV; it merely in uences the behavior of the cuto function near the upper and lower values of the pointwise bounds for the coe cient.
Proof. Since is bounded from below, there exists a minimizing sequence { } ∈ℕ ⊂ ( ). Furthermore, we may assume without loss of generality that there exists a > such that ( ) + TV( ) ≤ ( ) ≤ ( ) for all ∈ ℕ, and hence that { } ∈ℕ is bounded in ( ). By the compact embedding of ( ) into ( ) for any ∈ ℕ, we can thus extract a subsequence, denoted by the same symbol, converging strongly in ( ) to some¯∈ ( ). Due to the continuity of Φ as well as ofˆ, we haveˆΦ ( ) →ˆΦ (¯) ∈ in ( ).
Lower semi-continuity of and TV with respect to the strong convergence in ( ) and the weak convergence (ˆΦ ( )) ⇀ (ˆΦ (¯)) in ( ) from Lemma . yield that and thus that¯∈ ( ) is the desired minimizer.
The fact that¯∈ then follows by a contraposition argument based on Stampacchia's Lemma for functions and the pointwise de nition of ; see [ , Prop. . ].
The convergence of minimizers of ( . ) as → can be shown along the same lines as indicated at the end of [ , § ].
We now derive rst-order optimality conditions for the solution of ( . ). To this end, we rst show Fréchet di erentiability of the tracking term for any ∈ with ( ) = , which admits a unique solution ∈ by Lemma . . In the following, we use the regularity of solutions to identify the derivative in ∞ ( ) * with its representation in ( ), considered as a subset of ∞ ( ) * . Since the extension operatorˆis a ne, we also introduce the corresponding linear extension operatorˆ =ˆ : ( ) → (Ω). Lemma . . For every > , the mapping de ned in ( . ) is Fréchet di erentiable in every ∈ ∞ ( ), and the Fréchet derivative is given by where = (ˆΦ ( )) is the solution of ( . ), is the solution of ( . ), andˆ * : (Ω) → ( ) is the restriction operator.
We can now proceed exactly as in [ ] to obtain rst-order necessary optimality conditions. where and TV are considered as extended real-valued convex functionals on − ( ).
These conditions can be further interpreted pointwise. First, using the characterization of Lemma . , we can identify the rst term in the rst equation with the + ( ) function given by Second, using the characterization of from [ , § ], we have that The interpretation of the nal term is more delicate. Informally, ( ) corresponds to the mean curvature of¯( ) (if¯is smooth at ) or the signed normal to its jump set (if¯has a jump discontinuity across a measurable curve of − -dimensional Hausdor measure greater zero). This can be made more precise using the notion of the full trace from [ ]; see also [ ].
In this section, we address the numerical solution of ( . ) using a stabilized space-time nite element discretization for second-order hyperbolic equations [ ] and a nonlinear primal-dual proximal splitting algorithm [ , , ]. Since we now consider a nite-dimensional optimization problem, we can include the constraint ∈ directly via the multi-bang penalty instead of enforcing it inside the state equation. In this and the following section, we will therefore omit Φ from the discretized tracking term (and, with it, in general) and de ne the multi-bang penalty with dom = [ , ] as in [ ]; see ( . ) below.
. We consider a mesh T ℎ consisting of a nite set of triangles or tetrahedra with a mesh size ℎ. Then we introduce the space ℎ ⊂ (Ω) ∩ (Ω) of linear nite elements based on the triangulation T ℎ . A : . v , --page of basis of this space is given by the standard hat functions associated with nodes , = , . . . , ℎ , of the triangulation T ℎ . Next we discretize the time interval uniformly by = < · · · < = and grid size of . Similarly, we de ne the space ⊂ ( ) ∩ ( ) of piecewise linear and continuous functions with respect to these grids. Furthermore we consider the hat functions , = , . . . , , with ( ) = which form a basis of . We assume that can be represented by the triangulation T ℎ exactly and introduce the space Moreover we introduce the space ℎ of piecewise constant functions on the triangles in . In the following we also identify ℎ with ℝ for dim ℎ = and ℎ with ℝ for dim ℎ = . Finally we de ne := ( , ℎ) > and introduce the stabilization parameter ≥ . Definition . . We call ∈ := ℎ ⊗ a discrete solution of ( . ) if satis es This is a space-time nite element discretization with piecewise linear elements in space and in time. The additional -term in ( . ) serves as a stabilization term, which vanishes for → and is connected to the error term in the trapezoidal rule for the time integral of the third bilinear form in ( . ). The stability depends signi cantly on the value of with the method being more stable for larger ; e.g., for ≥ / , the method is unconditionally stable and convergent while for ≤ < / , a CFL-like condition has to be satis ed to ensure stability as well as convergence; see [ , Thms. . , . ] for homogeneous Dirichlet boundary conditions. At the same time, ( . ) can be formulated as the following time-stepping scheme: , , for all ∈ ℎ . For = / , this method is equivalent to the implicit, unconditionally stable, and convergent Crank-Nicolson scheme, while for = , the method is explicit if the spatial mass matrix is lumped. The main bene t in our context is that this is an adjoint-consistent discretization and therefore can be used to obtain a conforming discretization of ( . ) in a straight-forward manner. Next we introduce the discrete control-to-observation operator : ∩ ℎ → O de ned by ↦ → where is the solution of ( . ) for the coe cient ∈ ∩ ℎ . Let > . Then the implicit function theorem implies that is Fréchet di erentiable on the open subset for a.e. ∈ ∩ ℎ . : . v , --page of This set contains ∩ ℎ . The implicit function theorem is applicable since the following linearized discrete state equation is well-posed in the variable ∈ for every ℎ ∈ ℎ : ( . ) for all ∈ with ( ) = and initial condition ( ) = as well as = ( ). Thus the derivative of at is given by where solves ( . ) for ℎ . Its adjoint (with respect to the ( ) and O inner product) is given by solves the discrete adjoint equation for all ∈ with ( ) = and initial condition ( ) = , which can be formulated as a time-stepping scheme similar to ( . ).
We now introduce the variables ℎ and ℎ de ned by Thus ( ( ℎ )) * with ∈ O has the representation where is given by with the temporal mass matrix and sti ness matrix . For = , is a diagonal matrix.
We now address the discretization of the control costs in the optimization ( . ). Since a function ℎ ∈ ℎ is an element of ( ) and thus its weak derivatives are piecewise constant on the triangulation of , we have We furthermore approximate the integral in de nition of by the trapezoidal rule to obtain the discrete multi-bang penalty ℎ : ℎ → ℝ. Using these de nitions, we obtain the fully discrete optimization problem : . v , --page of Note that although ( . ) is discrete, it is still formulate in function spaces. To apply a minimization algorithm, we now reformulate it in terms of the coe cient vectors for the nite-dimensional functions. First, ℎ ∩ can be identi ed with the set Next we introduce the nite-dimensional subspace O ℎ = ( ) of O and the discrete control-to-observation on ℎ by S : ℝ → ℝ with = dim(O ℎ ) de ned by . Moreover we de ne the matrix O ∈ ℝ × by the mapping ( , ) ↦ → ( , ) O for , ∈ O ℎ . Thus the inner product and the norm of We denote the orthogonal projection onto O ℎ by O . The operator restricted to can be identi ed with a matrix ℎ ∈ ℝ × for = dim( ). Thus * can be identi ed with ℎ . With these identi cations, the adjoint operator ( ℎ ) * for ℎ ∈ ∩ ℎ acting on O ℎ can similarly identi ed with S (u) * : ℝ → ℝ for some u ∈ ℎ . Moreover we de ne the matrix ℎ ∈ ℝ × representing the bilinear form (∇ ℎ , ℎ ) (Ω) for ℎ ∈ ℎ and ℎ ∈ ( ℎ ) . Thus we have Finally, the trapezoidal rule in the de nition of ℎ can be expressed in the form of a mass lumping scheme, i.e., is the scalar multi-bang penalty including the box constraints from [ ] and = ∫ d are the diagonal entries of the lumped mass matrix; see [ , , , ].
Using these notations, we can write ( . ) equivalently in the form .
To solve ( . ), we extend the approach in [ ] by applying the nonlinear primal-dual proximal splitting method from [ , , ] together with a lifting trick. For this purpose, we write ( . ) (omitting the bold notation and the subscripts denoting vectors and discretizations from now on and assuming that ∈ O ℎ ) as . v , --page of we can apply the nonlinear primal-dual proximal splitting algorithm for step sizes , > satisfying ( ) * < . Convergence can be guaranteed under a secondorder type condition for and possibly further restrictions on the step sizes, whose (very technical) veri cation is outside the scope of this work. Instead, we restrict the discussion here on deriving the explicit form of ( . ) in the present setting.
First, we endow (ℝ × ℝ ) with the sum of the inner product induced by O (for ℝ ) and the Euclidean inner product (for ℝ ). With respect to this inner product, we obtain the adjoint Fréchet derivative ( ) * ( , ) = ( ) * ( ) + ℎ , where ( ) * is the fully discrete operator corresponding to ( . ) with right-hand side ∈ O for the adjoint equation.
The proximal point mapping for the (scaled) multi-bang penalty can be obtained by straightforward calculation based on a case di erentiation in the de nition of the subdi erential, see [ , Prop. . ]; for the sake of completeness, we give the short derivation here in full. By the de nition of the proximal mapping, = prox ( ) = (Id + ) − ( ) holds for any ∈ ℝ if and only if ∈ { }+ ( ). Recalling from [ , § ] that we now distinguish the following cases for : (i) = : In this case, ∈ { } + −∞, ( + ) = −∞, ( + ) + .
(iii) = , < < : Proceeding as in the rst case, we obtain ∈ − , ( + ) , ( + ) + + . : .  (iv) = : Similarly, this implies that Since this is a complete and disjoint case distinction for ∈ ℝ, we obtain the proximal mapping for the scalar penalty ; see Figure . By a standard argument, the proximal point mapping for ℎ is thus given componentwise by where we have set = −∞ and + = ∞ to avoid the need for further case distinctions. (Note that we compute the proximal mapping with respect to the inner product induced by the lumped mass matrix such that the weight cancels.) Finally, for F , we rst compute the Fenchel conjugate on ℝ × ℝ (with respect to the same inner product as above) as to obtain the proximal point mapping (again, with respect to this inner product) where the projection can be computed elementwise for each ∈ T ∩ as With these, ( . ) becomes the following explicit algorithm: .
v , --page of Note that this requires two solutions of the forward wave equation (as well as one solution of the adjoint equation) in each iteration, since + is based on the extrapolated vector¯+ , while the state vector required for the computation of ( + ) * + in the following iteration is based on the original update + . The iteration is terminated based on the residual norm in an equivalent reformulation of the optimality conditions for ( . ). Combining the approach of Section with standard results from convex analysis (see, e.g, [ , ]), any local minimizer¯of ( . ) together with the corresponding Lagrange multiplier¯and the residual¯:= (¯) − can be shown to satisfy For the rst equation, which holds in ℎ , we measure the residual in the discrete norm induced by the lumped mass matrix as in the de nition of ℎ . The second equation holds in O ℎ , and hence we measure the residual in the norm induced by the corresponding mass matrix O . Finally, the last equation holds in ℝ so we use the standard Euclidean norm. The iteration is terminated once the sum of these residuals drops below a given tolerance. For the implementation, note that the residual in the rst equation for ( , , ) reduces to − + . On the other hand, the residual in the second equation requires an additional solution of the state equation since here is applied to instead of the extrapolated¯. In practice, we thus do not evaluate the stopping criterion in every iteration.
We now illustrate the above presented approach with two numerical examples. The rst is a transmission problem (where waves produced by external forcing pass through the control domain before being observed) loosely motivated by acoustic tomography. The second is a re ection problem (where only re ected, not transmitted, waves are observed) that more closely models seismic inversion. Clason, Kunisch, Trautmann Optimal control of the principal coe icient in a scalar wave . . .

:
. v , --page of Figure : transmission example, exact coe cient (The number and location of source points as well as the amplitude and frequency of the wavelet are chosen such as to obtain a complex enough wave pattern to recover the lateral and depth-wise variations in the coe cient.) The discretization is performed using nodes in each space direction and nodes in time, corresponding to ℎ ≈ . and ≈ . . The stabilization constant is set to = / . The discretized exact solution is then perturbed componentwise by % relative Gaussian noise, i.e., we take = ( ( )) + .
where is a vector of independent normally distributed random components with mean and variance .
We now compute the reconstruction using the algorithm described in Section . , comparing the e ects of the total variation and the multi-bang penalty by taking ∈ { , − } and ∈ { , − }. In each case, we set the step sizes to = − and = and terminate when the residual norms (evaluated every iterations) drop below − . Again, these parameters are chosen to achieve a reasonable reconstruction in as few iterations as possible. (A proper parameter choice rule depending on the noise level and the discretization is left for future work.) The results can be seen in Figure . The case of pure multi-bang regularization ( = − and = , iterations); see Figure a) shows that indeed ( ) ∈ { , . . . , } almost everywhere; however, there is a clear lack of regularity of the reconstruction, which is not surprising as the original function-space problem is not well-posed for = . In contrast, the reconstruction case of pure TV regularization ( = and = − , iterations; see Figure b) shows a much more regular reconstruction that is constant on large regions; however, these constants are not necessarily from the admissible set { , . . . , }. Finally, combining both multi-bang and total variation regularization ( = − and = − , iterations; see Figure c) allows recovering more admissible values at the price of penalizing the magnitude of the coe cient value, which prevents the largest value = . from being attained. It is also noteworthy that in this case the tolerance for the residual norm is reached after signi cantly fewer iterations.
To illustrate the e ects of variation of the desired values on the reconstruction, we recompute the last example with the same parameters , but % increased values, i.e., = . ( − )/ , = , . . . , . The results are shown in Figure , where we repeat the exact coe cient from Figure with adjusted labels in Figure a for better comparison. As can be seen from Figure b, the reconstruction is similar to that for = . In particular, the total variation penalty prevents the misspeci ed desired values from being enforced strongly. This demonstrates that while misspeci ed values clearly do not have the same positive in uence on the reconstruction, they at least do not have a negative in uence. . We next consider an example which is inspired from seismic tomography. We assume that the data is given in the form of a time series of mean values of the re ected waves over certain spatial regions . Thus we de ne the observation space O = ( ) for ∈ ℕ and the observation operator where the ⊂ Ω are the spatial observation patches. Furthermore we assume that seismic sources are given by point sources located on the surface Γ ⊂ Ω whose magnitudes are time dependent and follow a Ricker wavelet of the form . v , --page of with ℎ, , ∈ ℝ . This leads to the modi ed state equation , and , , , are uniform random numbers in [ , ]. Here we take = .
For the discretization, we take a tensorial-based triangular mesh with ℎ = , = , and = / . The relative noise level is = . . An appropriate regularization parameter is given by = − ; for simplicity, we set = . The iteration is initialized with = and the stepsizes are again chosen as = − and = . The iteration is stopped if the absolute residuum is smaller than − ; in this experiment, this was reached after iterations. Figure shows the exact and noisy observations on , and . At the onset, we note two high spikes (a negative and a positive one) which are caused by the source wave initiated on boundary points Γ . The remaining oscillations are caused by the re ection waves originating from the discontinuities of and from the re ecting boundary; only these carry information about the coe cient, which makes the reconstruction challenging. The results are shown in Figure b, where each color map is scaled individually to show more details. We observe that the positions of the discontinuities in that are close to the observation patches are well approximated in¯and that the corresponding interfaces are quite sharp. However, the approximation quality of the discontinuities becomes worse farther away from the observation region. This is caused by the fact that re ected waves from lower sections of the discontinuities are more dispersed than the re ected waves from the upper sections of the discontinuities. We showed existence of solutions to an optimal control problem for the wave equation with the control entering into the principal part of the operator using total variation regularization and a reformulation of pointwise constraints using a cuto function. Preferential attainment of a discrete set of control values is incorporated through a multi-bang penalty. We also derived an improved regularity result for solutions of the wave equation under additional natural assumptions on the data and the control, which (for smooth cuto functions) allows obtaining necessary optimality conditions that can be interpreted in a suitable pointwise fashion. Finally, we demonstrated that the optimal control problem can be solved numerically using a combination of a stabilized nite element discretization and a nonlinear primal-dual proximal splitting algorithm. This work can be extended in several directions. Besides applying the proposed approach to more realistic models of acoustic tomography or seismic imaging for practical applications, it would be worthwhile to consider the case of boundary observations of the state [ ], which however may lead to an unbounded observation operator . A further challenging goal would be deriving su cient second-order conditions. Such conditions could then be used for obtaining discretization error estimates for the optimal controls or for showing convergence of the nonlinear primal-dual proximal splitting algorithm based on the "three-point condition" on from [ ].