1 Introduction

Let \(\alpha>0,~\beta >0\), and let \(\Omega \subset {\mathbb {R}}^3\) denote some open and bounded domain having a smooth boundary \(\Gamma =\partial \Omega \) and the unit outward normal \(\,\mathbf{n}\). We denote by \(\partial _\mathbf{n}\) the outward normal derivative to \(\Gamma \). Moreover, we fix some final time \(T>0\) and introduce for every \(t\in (0,T)\) the sets \(Q_t:=\Omega \times (0,t)\) and \(Q^t:=\Omega \times (t,T)\). Furthermore, we set \(Q:=Q_T\) and \(\Sigma :=\Gamma \times (0,T)\). We then consider the following optimal control problem:

(\({\mathcal C} {\mathcal P}\))    Minimize the cost functional

$$\begin{aligned} {{\mathcal {J}}}((\mu ,\varphi ,\sigma ),\mathbf{u}):= & {} \frac{b_1}{2} \iint _Q |\varphi -{\widehat{\varphi }}_Q|^2 + \frac{b_2}{2} \int _\Omega |\varphi (T)-{\widehat{\varphi }}_\Omega |^2 + \frac{b_0}{2} \iint _Q |\mathbf{u}|^2 \,+\,\kappa \,g(\mathbf{u})\nonumber \\=: & {} {{\mathcal {J}}}_1((\mu ,\varphi ,\sigma ),\mathbf{u}) + \kappa g(\mathbf{u}) \end{aligned}$$

subject to the state system

$$\begin{aligned}&\alpha \partial _t\mu +\partial _t\varphi -\Delta \mu =P(\varphi )(\sigma +\chi (1-\varphi )-\mu ) - \mathbbm {h}(\varphi )u_1 \quad&\hbox {in }\,Q , \end{aligned}$$
$$\begin{aligned}&\beta \partial _t\varphi -\Delta \varphi +F_1'(\varphi ) + F_2'(\varphi )=\mu +\chi \,\sigma \quad&\hbox {in }\,Q , \end{aligned}$$
$$\begin{aligned}&\partial _t\sigma -\Delta \sigma =-\chi \Delta \varphi -P(\varphi )(\sigma +\chi (1-\varphi )-\mu )+u_2\quad&\hbox {in }\,Q , \end{aligned}$$
$$\begin{aligned}&\partial _\mathbf{n}\mu =\partial _\mathbf{n}\varphi =\partial _\mathbf{n}\sigma =0 \quad&\hbox {on }\,\Sigma , \end{aligned}$$
$$\begin{aligned}&\mu (0)=\mu _0,\quad \varphi (0)=\varphi _0,\quad \sigma (0)=\sigma _0\,\quad&\hbox {in }\,\Omega , \end{aligned}$$

and to the control constraint

$$\begin{aligned} \mathbf{u}=(u_1,u_2)\in {{\mathcal {U}}}_{\mathrm{ad}}. \end{aligned}$$

Here, \(b_1,b_2, \kappa \) are nonnegative constants, while \(\,b_0\,\) is positive. Moreover, \({\widehat{\varphi }}_Q\) and \({\widehat{\varphi }}_\Omega \) are given target functions, and g denotes a convex but not necessarily differentiable functional that may account for possible sparsity effects; a typical case is \(g(\mathbf{u})=\Vert \mathbf{u}\Vert _{{L^1(Q)}^2}\). Moreover, the set of admissible controls \({{\mathcal {U}}}_{\mathrm{ad}}\) is a nonempty, closed and convex subset of the control space

$$\begin{aligned} {{\mathcal {U}}}:= L^\infty (Q)^2. \end{aligned}$$

The state system (1.2)–(1.6) constitutes a simplified and relaxed version of the four-species thermodynamically consistent model for tumor growth originally proposed by Hawkins-Daarud et al. [36]. Let us briefly review the role of the occurring symbols. The primary variables \(\varphi , \mu , \sigma \) denote the tumor fraction, the associated chemical potential, and the nutrient concentration, respectively. Furthermore, the additional term \(\alpha \partial _t\mu \) corresponds to a parabolic regularization of Eq. (1.2), while \(\beta \partial _t\varphi \) is the viscosity contribution to the Cahn–Hilliard equation. The nonlinearity P denotes a proliferation function, whereas the positive constant \(\chi \) represents the chemotactic sensitivity and provides the system with a cross-diffusion coupling. The evolution of the tumor fraction is mainly governed by the nonlinearities \(F_1\) and \(F_2\) whose derivatives occur in (1.3). Here, \(F_2\) is smooth, typically a concave function. As far as \(F_1\) is concerned, we consider in this paper the functions

$$\begin{aligned}&F_{\mathrm{1,log}}(r)=\left\{ \begin{array}{ll} (1+r)\,\ln (1+r)+(1-r)\,\ln (1-r)\quad &{} \quad \hbox {for} \,r\in (-1,1)\\ 2\,\ln (2)\quad &{} \quad \hbox {for} \,r\in \{-1,1\} ,\\ +\infty \quad &{} \quad \hbox {for} \,r\not \in [-1,1] \end{array}\right. \end{aligned}$$
$$\begin{aligned}&I_{[-1,1]}(r)=\left\{ \begin{array}{ll} 0\quad &{} \quad \hbox {for} \,r\in [-1,1]\\ +\infty \quad &{} \quad \hbox {for} \,r\not \in [-1,1] \end{array}\right. . \end{aligned}$$

We assume that \(I_{[-1,1]}+F_2\) is a double-well potential. This is actually the case if \(F_2(r)=k(1-r^2)\), where \(k>0\); the function \(I_{[-1,1]}+F_2\) is then referred to as a double obstacle potential. Note also that \(F'_{\mathrm{1,log}}(r)\) becomes unbounded as \(r\searrow -1\) and \({r\nearrow 1}\), and that in the case of (1.10) the second equation (1.3) has to be interpreted as a differential inclusion, where \(F_1'(\varphi )\) is understood in the sense of subdifferentials. Namely, (1.3) has to be written as

$$\begin{aligned} \beta \partial _t\varphi -\Delta \varphi +\xi +F_2'(\varphi )=\mu +\chi \sigma , \quad \xi \in \partial I_{[-1,1]}(\varphi ). \end{aligned}$$
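For orientation, the subdifferential of the indicator function can be written out explicitly. This is a standard fact from convex analysis, recalled here only for the reader's convenience:

$$\begin{aligned} \partial I_{[-1,1]}(r)=\left\{ \begin{array}{ll} (-\infty ,0]\quad &{} \quad \hbox {for} \,r=-1\\ \{0\}\quad &{} \quad \hbox {for} \,r\in (-1,1)\\ {[0,+\infty )}\quad &{} \quad \hbox {for} \,r=1 \end{array}\right. , \end{aligned}$$

while \(\partial I_{[-1,1]}(r)=\emptyset \) for \(r\not \in [-1,1]\). In particular, the selection \(\xi \) in the inclusion above vanishes wherever \(-1<\varphi <1\) and can only be nonzero on the contact sets where \(\varphi =\pm 1\).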

The control variable \(u_1\), which is nonlinearly coupled to the state variable \(\varphi \) in the phase Eq. (1.2), models the application of a cytotoxic drug to the system; it is multiplied by a truncation function \(\mathbbm {h}({\cdot })\) in order to have the action only in the spatial region where the tumor cells are located. Typically, one assumes that \(\mathbbm {h}(-1)=0, \mathbbm {h}(1)=1\), and \(\mathbbm {h}(\varphi )\) is in between if \(-1<\varphi <1\); see [27, 33, 39, 40] for some insights on possible choices of \(\mathbbm {h}\). Let us notice that, from the modeling viewpoint, the function \(\mathbbm {h}\) should be a nonnegative function. However, as for our mathematical analysis we just need \(\mathbbm {h}\) to be uniformly bounded, we will require no sign restriction later on. Besides, the control \(u_2\) can model either an external medication or some nutrient supply.

As far as well-posedness is concerned, the above model was already investigated in the case \(\chi =0\) in [4, 6,7,8], and in [23] with \(\alpha =\beta =\chi =0\). There the authors also pointed out how the relaxation parameters \(\alpha \) and \(\beta \) can be set to zero, by providing the proper framework in which a limit system can be identified and uniquely solved. We also note that in [12] a version has been studied in which the Laplacian in Eqs. (1.2)–(1.4) has been replaced by fractional powers of a more general class of selfadjoint operators having compact resolvents. A model which is similar to the one studied in this note was the subject of [15, 47].

For some nonlocal variations of the above model we refer to [25, 26, 42]. Moreover, in order to better emulate in-vivo tumor growth, it is possible to include in similar models the effects generated by the fluid flow development by postulating a Darcy’s law or a Stokes–Brinkman’s law. In this direction, we refer to [19, 22, 25, 27,28,29,30,31, 33, 51], and we also mention [34], where elastic effects are included. For further models, discussing the case of multispecies, we refer the reader to [19, 25].

The investigation of associated optimal control problems has also produced a large number of results, of which we mention [9, 12, 15, 20, 21, 26, 32, 35, 40, 43,44,45, 47, 48]. The optimal control problem (\({\mathcal C} {\mathcal P}\)) has recently been investigated by the present authors in [16] for the case of regular or logarithmic nonlinearities \(F_1\). For such nonlinearities, well-posedness of the state system (1.2)–(1.6), suitable differentiability properties of the control-to-state mapping, the existence of optimal controls, as well as first-order necessary and second-order sufficient optimality conditions could be established. In this paper, we focus on the nondifferentiable case when \(F_1=I_{[-1,1]}\). Although a well-posedness result was proved in [16] also for this case (in which (1.3) has to be replaced by the inclusion (1.11)), the corresponding optimal control problem has not yet been treated. The existence of optimal controls is not too difficult to show, but the derivation of necessary optimality conditions is challenging, since standard constraint qualifications to establish the existence of suitable Lagrange multipliers are not available. In order to handle this difficulty, we employ the so-called deep quench approximation, which has proven to be a useful tool in a number of optimal control problems for Cahn–Hilliard systems involving double obstacle potentials (cf., e.g., the papers [5, 10, 12,13,14, 44]).

In all of these works, the starting point was that the optimal control problem (we will later denote this problem by (\({\mathcal {CP}}_\gamma \))) had been successfully treated (by proving Fréchet differentiability of the control-to-state operator and establishing first-order necessary optimality conditions in terms of a variational inequality and the adjoint state system) for the case when in the state system (1.2)–(1.6) the nonlinearity \(F_1\) is, for \(\gamma >0\), given by

$$\begin{aligned} F_{1,\gamma }:=\gamma \, F_{\mathrm{1,log}}. \end{aligned}$$

We obviously have that

$$\begin{aligned} 0\,\le \,F_{1,{\gamma _1}}(r)\le & {} F_{1,{\gamma _2}}(r)\quad \forall \,r\in {\mathbb {R}}, \quad \hbox {if }\,0<\gamma _1<\gamma _2, \end{aligned}$$
$$\begin{aligned} \lim _{\gamma \searrow 0} F_{1,\gamma }(r)= & {} I_{[-1,1]}(r)\quad \forall \,r\in {\mathbb {R}}. \end{aligned}$$

In addition, we note that \(\,F_\mathrm{1,log}'(r)=\ln \left( \frac{1+r}{1-r}\right) \)  and  \(F_\mathrm{1,log}''(r)=\frac{2}{1-r^2}>0\)  for \(r\in (-1,1)\), and thus, in particular,

$$\begin{aligned}&\lim _{\gamma \searrow 0}\, F_{1,\gamma }'(r)= {\lim _{\gamma \searrow 0}\, \gamma \, F_{\mathrm{1,log}}'(r) = 0} \quad \hbox {for }\,-1<r<1,\\&\lim _{\gamma \searrow 0} \Bigl (\,\lim _{r\searrow -1}F_{1,\gamma }'(r)\Bigr )=-\infty , \quad \lim _{\gamma \searrow 0} \Bigl (\,\lim _{{r\nearrow 1}}F_{1,\gamma }'(r)\Bigr )=+\infty . \end{aligned}$$

We may therefore regard the graphs of the single-valued functions

$$\begin{aligned} F_{1,\gamma }'(r)\,=\, \gamma \, F_{\mathrm{1,log}}'(r), \quad \hbox {for}\quad r\in (-1,1)\quad \hbox {and}\quad \gamma >0, \end{aligned}$$

as approximations to the graph of the multi-valued subdifferential \(\partial I_{[-1,1]}\) from the interior of \((-1,1)\).
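The pointwise behavior of the deep quench family can be checked numerically. The following is a minimal sketch, not part of the analysis: the function names `f1_log`, `f1_gamma`, `df1_gamma` and the sample values of \(\gamma \) and \(r\) are chosen here purely for illustration. It evaluates \(F_{1,\gamma }=\gamma F_{\mathrm{1,log}}\) and its derivative for decreasing \(\gamma \).

```python
import math

def f1_log(r):
    """F_{1,log}(r): finite on [-1, 1], +inf outside, as in the definition above."""
    if abs(r) > 1.0:
        return math.inf
    if abs(r) == 1.0:
        return 2.0 * math.log(2.0)
    return (1.0 + r) * math.log(1.0 + r) + (1.0 - r) * math.log(1.0 - r)

def f1_gamma(r, gamma):
    """Deep quench approximation F_{1,gamma} = gamma * F_{1,log} (gamma > 0)."""
    return gamma * f1_log(r)

def df1_gamma(r, gamma):
    """Derivative gamma * ln((1+r)/(1-r)); only defined for -1 < r < 1."""
    return gamma * math.log((1.0 + r) / (1.0 - r))

# Illustrative sample values (assumptions, not taken from the paper).
gammas = [1.0, 1e-1, 1e-2, 1e-3]
points = [-1.0, -0.9, 0.0, 0.5, 1.0, 1.5]

for r in points:
    # For r in [-1, 1] the values shrink towards 0 = I_{[-1,1]}(r);
    # for r outside [-1, 1] they stay at +inf for every gamma.
    print(r, [f1_gamma(r, g) for g in gammas])

for r in [-0.9, 0.0, 0.5]:
    # Interior derivatives tend to 0 as gamma decreases.
    print(r, [df1_gamma(r, g) for g in gammas])
```

For fixed \(r\in [-1,1]\) the printed values decrease with \(\gamma \), in line with the monotonicity and the pointwise limit stated above, while for \(r\not \in [-1,1]\) they remain \(+\infty \).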

For both \(F_1=I_{[-1,1]}\) (in which case (1.3) has to be replaced by the inclusion (1.11)) and \(F_1=F_{1,\gamma }\) (where \(\gamma >0\)), the well-posedness results from [16] yield the existence of a unique solution \((\mu ,\varphi ,\sigma )\) and \((\mu _\gamma ,\varphi _\gamma ,\sigma _\gamma )\) to the state system (1.2)–(1.6) provided that the controls \(\mathbf{u}=(u_1,u_2)\) belong to \(L^\infty (0,T;{L^2(\Omega )})^2\). It is natural to expect that \((\mu _\gamma , \varphi _\gamma ,\sigma _\gamma )\rightarrow (\mu ,\varphi ,\sigma )\) as \(\gamma \searrow 0\) in a suitable topology.

Below (cf. Theorem 3.1), we will show that this is actually true. Owing to the construction, the approximating functions \(\,\varphi _\gamma \,\) automatically attain their values in the domain of \(I_{[-1,1]}\); that is, we have \(\Vert \varphi _\gamma \Vert _{L^\infty (Q)}\,\le \,1\)  for all \(\gamma >0\).

Let us now consider the control problem, which in the following will be denoted by (\({\mathcal {CP}}_0\)) if \(F_1=I_{[-1,1]}\) and by (\({\mathcal {CP}}_\gamma \)) if \(F_1=F_{1,\gamma }\). The general strategy is then to derive uniform (with respect to \(\gamma \in (0,1]\)) a priori estimates for the state and adjoint state variables of an “adapted” version of (\({\mathcal {CP}}_\gamma \)) that are sufficiently strong as to permit a passage to the limit as \(\gamma \searrow 0\) in order to derive meaningful first-order necessary optimality conditions also for (\({\mathcal {CP}}_0\)). It turns out that this strategy succeeds.

Another remarkable novelty of this paper is the discussion of the sparsity of optimal controls for (\({\mathcal {CP}}_0\)). Since the seminal paper [49], sparse optimal controls have been discussed extensively in the literature. Directional sparsity was introduced in [37, 38] and extended to semilinear parabolic optimal control problems in [1]. Sparse optimal controls for reaction-diffusion equations were investigated in [2, 3]. In the recent work [47], sparsity results that apply to nonlinearities \(\,F_1\,\) of logarithmic type were established for a slightly different state system. In view of the medical background, the focus in [47] was set on sparsity with respect to time, since temporal sparsity means that the controls (e.g., cytotoxic drugs) are not needed in certain time periods. It turns out that the technique used in [47] can be adapted to establish sparsity results also for our state system for the nondifferentiable case \(F_1=I_{[-1,1]}\) in which the evolution of the tumor fraction is governed by a variational inequality. The results obtained, however, are weaker than those recovered in [47] for the differentiable case. This is not entirely unexpected in view of the fact that less information on the adjoint state variables can be recovered from the corresponding adjoint state system than in the simpler differentiable situation.

The paper is organized as follows: in Sect. 2, we collect auxiliary results on the state system (1.2)–(1.6) that have been established in [16]. The subsequent Sect. 3 provides a detailed analysis of the deep quench approximation. Section 4 is then devoted to the derivation of first-order necessary optimality conditions for the case \(F_1=I_{[-1,1]}\). In the final Sect. 5, we investigate sparsity properties of the optimal controls for the double obstacle case.

Throughout the paper, we make repeated use of Hölder’s inequality, of the elementary Young inequality

$$\begin{aligned} a b\,\le \delta |a|^2+\frac{1}{4\delta }|b|^2\quad \forall \,a,b\in {\mathbb {R}}, \quad \forall \,\delta >0, \end{aligned}$$

as well as the continuity of the embeddings \(H^1(\Omega )\subset L^p(\Omega )\) for \(1\le p\le 6\) and \(H^{2}(\Omega )\subset C^0({\overline{\Omega }})\).
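For the reader's convenience, we note that the Young inequality stated above follows by completing the square; the one-line verification reads

$$\begin{aligned} \delta |a|^2+\frac{1}{4\delta }|b|^2-a b \,=\,\Bigl (\sqrt{\delta }\,a-\frac{b}{2\sqrt{\delta }}\Bigr )^2\,\ge \,0 \quad \forall \,a,b\in {\mathbb {R}}, \quad \forall \,\delta >0. \end{aligned}$$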

2 General Setting and Properties of the Control-to-State Operator

In this section, we introduce the general setting of our control problem and state some results on the state system (1.2)–(1.6) that in the present form have been established in [16]. For similar results, we also refer to the papers [15, 47].

To begin with, for a Banach space \(\,X\,\) we denote by \(\Vert \cdot \Vert _X\) the norm in the space X or in a power thereof, by \(\,X^*\,\) its dual space, and by \(\langle \,\cdot \,,\,\cdot \,\rangle _X\) the duality pairing between \(X^*\) and X. For any \(1 \le p \le \infty \) and \(k \ge 0\), we denote the standard Lebesgue and Sobolev spaces on \(\Omega \) by \(L^p(\Omega )\) and \(W^{k,p}(\Omega )\), and the corresponding norms by \(\mathopen \Vert \,\cdot \,\mathclose \Vert _{L^p(\Omega )}=\mathopen \Vert \,\cdot \,\mathclose \Vert _{p}\) and \(\mathopen \Vert \,\cdot \,\mathclose \Vert _{W^{k,p}(\Omega )}\), respectively. For the case \(p = 2\), these become Hilbert spaces and we employ the standard notation \(H^k(\Omega ) := W^{k,2}(\Omega )\). As usual, for Banach spaces \(\,X\,\) and \(\,Y\,\) we introduce the linear space \(\,X\cap Y\,\) which becomes a Banach space when equipped with its natural norm \(\,\Vert v\Vert _{X \cap Y}:= \Vert v\Vert _X\,+\,\Vert v\Vert _Y\), for \(\,v\in X\cap Y\). Moreover, we recall the definition (1.8) of the control space \({{\mathcal {U}}}\) and introduce the spaces

$$\begin{aligned}&H := L^{2}(\Omega ), \quad V := H^{1}(\Omega ), \quad W_{0} := \{v\in H^{2}(\Omega ): \ \partial _\mathbf{n}v=0 \,\hbox { on} \,\Gamma \}. \end{aligned}$$

Furthermore, by \((\cdot ,\cdot )\) and \(\Vert \,\cdot \,\Vert \) we denote the standard inner product and related norm in H, and for simplicity we also set \(\langle \,\cdot \,,\,\cdot \,\rangle :=\langle \,\cdot \,,\,\cdot \,\rangle _V\).

We make the following assumptions on the data of the system.


(A1) \(\alpha ,\beta \), and \(\chi \) are positive constants.


(A2) \(F=F_1+F_2\), where \(F_1:{\mathbb {R}}\rightarrow [0,+\infty ]\) is convex and lower semicontinuous with \(\,F_1(0)=0\), and where \(F_2 \in C^5({\mathbb {R}})\) has a Lipschitz continuous derivative \(F'_2\).


(A3) \(P ,\mathbbm {h}\in C^3({\mathbb {R}})\cap W^{3,\infty }({\mathbb {R}})\) are nonnegative and bounded.


(A4) With fixed given constants \({\underline{u}}_i, \widehat{u}_i\) satisfying \({\underline{u}}_i< \widehat{u}_i\), \(i=1,2\), we have

$$\begin{aligned}&{{\mathcal {U}}}_{\mathrm{ad}}=\left\{ \mathbf{u}=(u_1,u_2)\in {L^\infty (Q)}^2: \underline{u}_i\le u_i\le \widehat{u}_i\,\hbox { a.e. in }\,Q\,\hbox { for }\,i=1,2\right\} . \end{aligned}$$

Observe that (A3) implies that the functions \(P,P',P'',\mathbbm {h},\mathbbm {h}'\), and \(\mathbbm {h}''\) are Lipschitz continuous on \({\mathbb {R}}\). Let us also note that both \(F_1=F_{\mathrm{1,log}}\) and \(F_1=I_{[-1,1]}\) are admissible for (A2). Moreover, (A2) implies that the subdifferential \(\partial F_1\) of \(F_1\) is a maximal monotone graph in \({\mathbb {R}}\times {\mathbb {R}}\) with effective domain \(D(\partial F_1 ) {{}\subseteq {}} D(F_1 ) \); since \(F_1\) attains its minimum value 0 at 0, it also turns out that \(0\in D(\partial F_1 )\) and \(0\in \partial F_1(0)\).

The following additional condition on the nonlinearity \(F_1\) will be considered later, when discussing strong solutions.


(A5) There exists an interval \(\,(r_-,r_+)\,\) with \(\,-\infty \le r_-<0<r_+\le +\infty \,\) such that the restriction of \(F_1\) to \(\,(r_-,r_+)\,\) belongs to \(\,C^5(r_-,r_+)\) and such that

$$\begin{aligned} \lim _{r\searrow \, { r_-}}F_1'(r)=-\infty , \quad \lim _{r\nearrow \, {r_+}} F_1'(r)=+\infty . \end{aligned}$$

Moreover, for the optimal control application that will be analyzed later, we make the following general assumptions:


  The constants \(b_1,b_2,\kappa \) are nonnegative, and \(b_0\) is positive.


  We have \({\widehat{\varphi }}_\Omega \in L^{2}(\Omega )\) and \({\widehat{\varphi }}_Q\in L^2(Q)\).


  \(g:L^2(Q)^2\rightarrow {\mathbb {R}}\) is nonnegative, continuous and convex on \(L^2(Q)^2\).

Next, we introduce our notion of (weak) solution to the state system (1.2)–(1.6).

Definition 2.1

A quadruplet \((\mu ,\varphi ,\xi ,\sigma )\) is called a weak solution to the initial-boundary value problem (1.2)–(1.6) if

$$\begin{aligned} \varphi\in & {} H^{1}(0,T;H) \cap L^\infty (0,T; V ) \cap L^2(0,T; W_0), \end{aligned}$$
$$\begin{aligned} \mu ,\sigma\in & {} H^{1}(0,T;V^*) \cap L^\infty (0,T; H ) \cap L^2(0,T; V), \end{aligned}$$
$$\begin{aligned} \xi\in & {} L^2(0,T; H ), \end{aligned}$$

and if \((\mu ,\varphi ,\xi ,\sigma )\) satisfies the corresponding weak formulation given by

$$\begin{aligned}&\mathopen \langle \partial _t(\alpha \mu + \varphi ), v \mathclose \rangle + \int _\Omega \nabla \mu \cdot \nabla v = \int _\Omega P(\varphi )(\sigma +\chi (1-\varphi )-\mu )v -\int _\Omega \mathbbm {h}(\varphi )u_1 v \nonumber \\&\qquad \hbox {for every} \,\, v \in V \,\, \hbox {and {almost everywhere} in} \,\, (0,T), \end{aligned}$$
$$\begin{aligned}&\beta \partial _t\varphi -\Delta \varphi +\xi +F_2'(\varphi )=\mu +\chi \,\sigma , \quad {\xi \in \partial F_1(\varphi )}, \,\, \hbox {a.e. in} \,\, Q, \end{aligned}$$
$$\begin{aligned}&\mathopen \langle \partial _t\sigma , v\mathclose \rangle + \int _\Omega \nabla \sigma \cdot \nabla v =\chi \int _\Omega \nabla \varphi \cdot \nabla v - \int _\Omega P(\varphi )(\sigma +\chi (1-\varphi )-\mu )v + \int _\Omega u_2 v \nonumber \\&\qquad \hbox {for every} \,\, v \in V \,\, \hbox {and {almost everywhere} in} \,\, (0,T), \end{aligned}$$
$$\begin{aligned}&\mu (0)=\mu _0, \quad \varphi (0)=\varphi _0, \quad \sigma (0)=\sigma _0 \quad \hbox {a.e. in} \,\, \Omega . \end{aligned}$$

Observe that the homogeneous Neumann boundary conditions (1.5) are encoded in the condition (2.3) for \(\varphi \) (by the definition of the space \(W_0\)) and in the variational equalities (2.6) and (2.8) for \(\mu \) and \(\sigma \), by the use of the forms \(\int _\Omega \nabla \mu \cdot \nabla v \) and \(\int _\Omega \nabla \sigma \cdot \nabla v \). Moreover, let us point out that at this level the control pair \((u_1,u_2)\) just plays the role of two fixed forcing terms in (2.6) and (2.8). Let us also mention that the initial conditions (2.9) are meaningful since (2.3) and (2.4) ensure that \(\varphi \in C^0([0,T];V)\) and \(\mu , \sigma \in C^0([0,T];H)\).

The following result is a special case of [16, Thm. 2.2].

Theorem 2.2

Assume that (A1)–(A3) are fulfilled, let the initial data satisfy

$$\begin{aligned} \mu _0, \sigma _0 \in L^{2}(\Omega ), \quad \varphi _0 \in H^{1}(\Omega ), \quad F_1(\varphi _0) \in L^{1}(\Omega ), \end{aligned}$$

and suppose that

$$\begin{aligned} (u_1, u_2) \in L^2(Q) \times L^2(Q). \end{aligned}$$

Then there exists at least one solution \((\mu ,\varphi ,\xi ,\sigma )\) in the sense of Definition 2.1. Moreover, if \(u_1 \in L^\infty (Q)\) then there is only one such solution.

Observe that the above well-posedness result is valid also for the case when \(F_1=I_{[-1,1]}\). This is not the case for the next result concerning the existence of strong solutions, which however applies to the logarithmic case \(F_1=F_{\mathrm{1,log}}\). For this purpose, we invoke assumption (A5) and remark that the regularity postulated for the potential \(F_1\) entails that its derivative can be defined in the classical manner in \((r_-,r_+)\), so that we will no longer need to consider a selection \(\xi \) in the notion of strong solution below. Moreover, it will be useful to fix once and for all some \(R>0\) such that

$$\begin{aligned} {{\mathcal {U}}}_R:=\left\{ \mathbf{u}=(u_1,u_2)\in L^\infty (Q)^2:\,\Vert \mathbf{u}\Vert _{L^\infty (Q)^2}<R\right\} \supset {{\mathcal {U}}}_{\mathrm{ad}}. \end{aligned}$$

We then have the following well-posedness result for the state system (where the equations and conditions are fulfilled almost everywhere in Q), which has been proved in [16, Theorem 2.3]:

Theorem 2.3

Suppose that the conditions (A1)–(A5) and (2.12) are fulfilled, and let the initial data satisfy the conditions

$$\begin{aligned}&\mu _0,\sigma _0 \in H^{1}(\Omega ) \cap L^{\infty }(\Omega ), \quad \varphi _0 \in { W_0 ,} \end{aligned}$$
$$\begin{aligned}&r_-<\min _{x\in \overline{\Omega }}\,\varphi _0(x) \le \max _{x\in \overline{\Omega }}\,\varphi _0(x)<r_+. \end{aligned}$$

Then the state system (1.2)–(1.6) has for every \(\mathbf{u}=(u_1,u_2)\in {{\mathcal {U}}}_{R}\) a unique strong solution \((\mu ,\varphi ,\sigma )\) with the regularity

$$\begin{aligned}&\mu \in H^1(0,T;H) \cap C^0([0,T];V) \cap L^2(0,T;W_0)\cap L^\infty (Q), \end{aligned}$$
$$\begin{aligned}&\varphi \in W^{1,\infty }(0,T;H)\cap H^1(0,T;V)\cap L^\infty (0,T;W_0) \cap C^0({\overline{Q}}), \end{aligned}$$
$$\begin{aligned}&\sigma \in H^1(0,T;H)\cap C^0([0,T];V)\cap L^2(0,T;W_0)\cap L^\infty (Q). \end{aligned}$$

Moreover, there is a constant \(K_1>0\), which depends on \(\Omega ,T,R,\alpha ,\beta \) and the data of the system, but not on the choice of \(\mathbf{u}\in {{\mathcal {U}}}_{R}\), such that

$$\begin{aligned}&\Vert \mu \Vert _{H^1(0,T;H) \cap C^0([0,T];V) \cap L^2(0,T;W_0)\cap L^\infty (Q)}\nonumber \\&+\,\Vert \varphi \Vert _{W^{1,\infty }(0,T;H)\cap H^1(0,T;V) \cap L^\infty (0,T;W_0)\cap C^0({\overline{Q}})} \nonumber \\&+\,\Vert \sigma \Vert _{H^1(0,T;H) \cap C^0([0,T];V) \cap L^2(0,T;W_0)\cap L^\infty (Q)}\,\le \,K_1 . \end{aligned}$$

Furthermore, there are constants \(r_*,r^*\), which depend on \(\Omega ,T,R,\alpha ,\beta \) and the data of the system, but not on the choice of \(\mathbf{u}\in {{\mathcal {U}}}_{R}\), such that

$$\begin{aligned} r_-<r_*\le \varphi (x,t)\le r^*<r_+ \quad \hbox {for all}\,\, (x,t)\in {\overline{Q}}. \end{aligned}$$

Also, there is some constant \(K_2>0\) having the same dependencies as \(K_1\) such that

$$\begin{aligned}&\max _{i=0,1,2,3}\,\left\| P^{(i)}(\varphi )\right\| _{L^\infty (Q)}\, + \max _{i=0,1,2,3}\,\left\| \mathbbm {h}^{(i)}(\varphi )\right\| _{L^\infty (Q)}\, \nonumber \\&\qquad +\,\max _{i=0,1,2,3,4{,5}}\,\left\| F^{(i)}(\varphi )\right\| _{L^\infty (Q)} \,\le \,K_2 . \end{aligned}$$

Remark 2.4

Condition (2.19), known as the separation property, is especially relevant for the case of singular potentials (such as \(F_1=F_{\mathrm{1,log}}\)). Indeed, it guarantees that the phase variable \(\varphi \) always stays away from the critical values \(\,r_-,r_+\) that may correspond to the pure phases. Hence, the singularity of the potential is no longer an obstacle for the analysis as the values of \(\varphi \) range in some interval in which \(F_1\) is smooth.

Owing to Theorem 2.3, the control-to-state operator

$$\begin{aligned} \, {{\mathcal {S}}}:\mathbf{u}=(u_1,u_2)\mapsto (\mu ,\varphi ,\sigma )\, \end{aligned}$$

is well defined as a mapping between \({{\mathcal {U}}}=L^\infty (Q)^2\) and the Banach space specified by the regularity results (2.15)–(2.17). We now discuss its differentiability properties. The results obtained are originally due to [47] and have been slightly generalized in [16] to the version reported here. For this purpose, some functional analytic preparations are in order. We first define the linear spaces

$$\begin{aligned} {{\mathcal {X}}}\,&:=\,X\times {\widetilde{X}}\times X, \,\,\, \hbox {where }\nonumber \\ X\,&:=\,H^1(0,T;H)\cap {{}L^\infty (0,T; V) {}}\cap L^2(0,T;W_0)\cap L^\infty (Q), \nonumber \\ {\widetilde{X}}\,&:=W^{1,\infty }(0,T;H)\cap H^1(0,T;V)\cap L^\infty (0,T;W_0)\cap C^0({\overline{Q}}), \end{aligned}$$

which are Banach spaces when endowed with their natural norms. Next, we introduce the linear space

$$\begin{aligned}&{{\mathcal {Y}}}\,:=\,\bigl \{(\mu ,\varphi ,\sigma )\in {\mathcal {X}}: \,\alpha \partial _t\mu +\partial _t\varphi -\Delta \mu \in {L^\infty (Q)}, \,\,\,\beta \partial _t\varphi -\Delta \varphi -\mu \in {L^\infty (Q)},\\&\partial _t\sigma -\Delta \sigma +\chi \Delta \varphi \in {L^\infty (Q)}\bigr \}, \end{aligned}$$

which becomes a Banach space when endowed with the norm

$$\begin{aligned} \Vert (\mu ,\varphi ,\sigma )\Vert _{{{\mathcal {Y}}}}\,:=\,&\Vert (\mu ,\varphi ,\sigma )\Vert _{\mathcal {X}} \,+\,\Vert \alpha \partial _t\mu +\partial _t\varphi -\Delta \mu \Vert _{{L^\infty (Q)}} \,+\,\Vert \beta \partial _t\varphi -\Delta \varphi -\mu \Vert _{{L^\infty (Q)}}\\&+\,\Vert \partial _t\sigma -\Delta \sigma +\chi \Delta \varphi \Vert _{{L^\infty (Q)}} . \end{aligned}$$

Finally, we put

$$\begin{aligned}&Z:=H^1(0,T;H)\cap L^\infty (0,T;V)\cap L^2(0,T;W_0),\\&{{\mathcal {Z}}}:= Z\times {\widetilde{X}}\times Z. \end{aligned}$$

Now suppose that \(\overline{\mathbf{u}}=(\overline{u}_1,\overline{u}_2)\in {{\mathcal {U}}}_{R}\) is arbitrary and that \(({\overline{\mu }},{\overline{\varphi }},{\overline{\sigma }})={{\mathcal {S}}}(\overline{\mathbf{u}})\). We then consider the linearization of the state system at \(((\overline{u}_1,\overline{u}_2),({\overline{\mu }},{\overline{\varphi }},{\overline{\sigma }}))\) given by the following linear initial-boundary value problem:

$$\begin{aligned}&\alpha \partial _t\eta +\partial _t\rho -\Delta \eta \,=\,P({\overline{\varphi }})(\zeta -\chi \rho -\eta )+P'({\overline{\varphi }})({\overline{\sigma }}+\chi (1-{\overline{\varphi }})-{\overline{\mu }})\rho - \mathbbm {h}'({\overline{\varphi }})\,\overline{u}_1\,\rho \nonumber \\&-\mathbbm {h}({\overline{\varphi }}){h_1}\quad \hbox {in }\,Q, \end{aligned}$$
$$\begin{aligned}&\beta \partial _t\rho -\Delta \rho -\eta \,=\,\chi \,\zeta -F''({\overline{\varphi }})\rho \quad \hbox { in} \,\, Q, \end{aligned}$$
$$\begin{aligned}&\partial _t\zeta -\Delta \zeta +\chi \Delta \rho \,=\,-P({\overline{\varphi }})(\zeta -\chi \rho -\eta )- P'({\overline{\varphi }})({\overline{\sigma }}+\chi (1-{\overline{\varphi }})-{\overline{\mu }})\rho \nonumber \\&+h_2 \quad \hbox { in} \,Q, \end{aligned}$$
$$\begin{aligned}&\partial _\mathbf{n}\eta \,=\,\partial _\mathbf{n}\rho \,=\,\partial _\mathbf{n}\zeta \,=\,0 \quad \hbox { on} \,\Sigma , \end{aligned}$$
$$\begin{aligned}&\eta (0)\,=\,\rho (0)\,=\,\zeta (0)\,=\,0\,\,\,\hbox { in}\,\Omega . \end{aligned}$$
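To see where the right-hand sides of the linearized system come from, one may formally differentiate the nonlinear terms of the state system along a perturbation of the state and the control. The following is a heuristic sketch only: the real parameter \(s\) is introduced here purely for illustration, and the rigorous justification is provided by the differentiability result from [16] quoted below.

$$\begin{aligned}&\frac{d}{ds}\Bigl [P({\overline{\varphi }}+s\rho )\bigl ({\overline{\sigma }}+s\zeta +\chi (1-{\overline{\varphi }}-s\rho )-{\overline{\mu }}-s\eta \bigr )\Bigr ]_{s=0}\\&\quad =\,P({\overline{\varphi }})(\zeta -\chi \rho -\eta )+P'({\overline{\varphi }})({\overline{\sigma }}+\chi (1-{\overline{\varphi }})-{\overline{\mu }})\rho ,\\&\frac{d}{ds}\Bigl [\mathbbm {h}({\overline{\varphi }}+s\rho )(\overline{u}_1+s h_1)\Bigr ]_{s=0}\,=\,\mathbbm {h}'({\overline{\varphi }})\,\overline{u}_1\,\rho +\mathbbm {h}({\overline{\varphi }})h_1 . \end{aligned}$$

These are exactly the reaction and control terms appearing in the first and third equations of the linearized system, while the term \(F''({\overline{\varphi }})\rho \) in the second equation arises in the same way from \(F'({\overline{\varphi }}+s\rho )\).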

According to [16, Lem. 4.1] and its proof (see, in particular, Remark 4.2 and Eqs. (4.37)–(4.39) in [16]), we have the following:

$$\begin{aligned}&\hbox {The linear system}\,\,(2.22){-}(2.26) \,\, \hbox {has for every}\,\, \mathbf{h}=(h_1,h_2) \in L^2(Q)^2 \,\, \hbox {a unique solution } \nonumber \\&{(\eta ,\rho ,\zeta )\in {{\mathcal {Z}}}}, \,\, \hbox {and the linear mapping} \,\, \mathbf {h}\mapsto (\eta ,\rho ,\zeta ) \,\, \hbox {belongs to}\,\, {\mathcal L}(L^2(Q)^2,{{\mathcal {Z}}}). \end{aligned}$$
$$\begin{aligned}&\hbox {The linear system}\,\, (2.22){-}(2.26) \,\, \hbox {has for every} \,\, \mathbf {h}={(h_1,h_2)}\in {L^\infty (Q)}^2 \,\, \hbox {a unique solution}\nonumber \\&{(\eta ,\rho ,\zeta )\in {{\mathcal {Y}}}}, \,\, \hbox {and the linear mapping} \,\, \mathbf {h}\mapsto (\eta ,\rho ,\zeta ) \,\, \hbox {belongs to} \,\, {\mathcal L}({L^\infty (Q)}^2,{{\mathcal {Y}}}). \end{aligned}$$

Moreover, we have the following differentiability result (see [16, Thm. 4.4]):

Theorem 2.5

Suppose that the conditions (A1)–(A5) and (2.12) are fulfilled, let the initial data \((\mu _0,\varphi _0,\sigma _0)\) satisfy (2.13) and (2.14), and assume that \({\overline{\mathbf{u}}}=(\overline{u}_1,\overline{u}_2)\in {{\mathcal {U}}}_R\) is arbitrary and \(({\overline{\mu }},{\overline{\varphi }},{\overline{\sigma }})={{\mathcal {S}}}({\overline{\mathbf{u}}})\). Then the control-to-state operator \({{\mathcal {S}}}\) is twice continuously Fréchet differentiable at \(\,{\overline{\mathbf{u}}}\,\) as a mapping from \(\,{{\mathcal {U}}}\,\) into \(\,{{\mathcal {Y}}}\). Moreover, for every \({ \mathbf{h}=(h_1,h_2)}\in {{\mathcal {U}}}\), the Fréchet derivative \(\,D {{\mathcal {S}}}({\overline{\mathbf{u}}})\in {{\mathcal {L}}}({{\mathcal {U}}},{{\mathcal {Y}}})\,\) of \(\,{{\mathcal {S}}}\,\) at \(\,{\overline{\mathbf{u}}}\,\) is given by the identity \(\,D{{\mathcal {S}}}({\overline{\mathbf{u}}}){(\mathbf{h})}=(\eta ,\rho ,\zeta )\), where \((\eta ,\rho ,\zeta )\) is the unique solution to the linear system (2.22)–(2.26).

Remark 2.6

As \(L^\infty (Q)^2\) is densely embedded in \(L^2(Q)^2\), the Fréchet derivative \(\,D {{\mathcal {S}}}({\overline{\mathbf{u}}})\), which by virtue of the continuity of the embedding \({{\mathcal {Y}}}\subset {{\mathcal {Z}}}\) also belongs to the space \({{\mathcal {L}}}({L^\infty (Q)}^2,{{\mathcal {Z}}})\), can be continuously extended to a linear operator in \({{\mathcal {L}}}(L^2(Q)^2,{\mathcal Z})\), which we still denote by \(D {{\mathcal {S}}}({\overline{\mathbf{u}}})\). It then follows from (2.27) that also for \(\mathbf{h}=(h_1,h_2) \in L^2(Q)^2\) the identity \(\,D{{\mathcal {S}}}({\overline{\mathbf{u}}}){(\mathbf{h})}=(\eta ,\rho ,\zeta )\) is valid.

Remark 2.7

For the explicit form of the second-order Fréchet derivative \(D^2 {{\mathcal {S}}}({\overline{\mathbf{u}}}) \in {{\mathcal {L}}}({{\mathcal {U}}},\mathcal L({{\mathcal {U}}},{{\mathcal {Y}}})),\) we refer the reader to [16, Thm. 4.8].

3 Deep Quench Approximation of the State System

In this section, we discuss the deep quench approximation of the state system (1.2)–(1.6), where we generally assume that the conditions (A1)–(A4) and (2.12)–(2.13) are fulfilled and that (2.14) is satisfied with \((r_-,r_+)=(-1,1)\). We now consider the state system (1.2)–(1.6) for the cases \(F_1=I_{[-1,1]}\) and \(F_1=F_{1,\gamma }\) (\(\gamma \in (0,1]\)), respectively. Since the logarithmic functions \(F_{1,\gamma }\) satisfy the condition (A5), the state system (1.2)–(1.6) has by Theorem 2.3 for every \(\mathbf{u}=(u_1,u_2)\in {{\mathcal {U}}}_{R}\) and \(F_1=F_{1,\gamma }\), \(\gamma \in (0,1]\), a unique solution triplet \((\mu _\gamma ,\varphi _\gamma ,\sigma _\gamma )\) with the regularity specified by (2.15)–(2.17). By virtue of Theorem 2.2, there also exists a unique weak solution \((\mu ^0,{\varphi ^0}, \xi ^0, \sigma ^0)\) to the state system (2.6)–(2.9) for \(F_1=I_{[-1,1]}\) that enjoys the regularity specified by (2.3)–(2.5). Clearly, we must have

$$\begin{aligned} -1\le \varphi _\gamma \le 1\,\hbox { a.e. in} \,Q, \, \hbox {for all}\,\, \gamma \in (0,1], \, \hbox {and } \, -1\le \varphi ^0\le 1\, \hbox { a.e. in} \,Q. \end{aligned}$$

We introduce the corresponding solution operators

$$\begin{aligned}&{{\mathcal {S}}}_\gamma :{{\mathcal {U}}}_{R}\ni \mathbf{u}\mapsto {{\mathcal {S}}}_\gamma (\mathbf{u})=\big ({{\mathcal {S}}}_\gamma ^1(\mathbf{u}),{{\mathcal {S}}}_\gamma ^2(\mathbf{u}),{{\mathcal {S}}}_\gamma ^3(\mathbf{u})\big ):=(\mu _\gamma ,\varphi _\gamma ,\sigma _\gamma ) \quad \hbox {for}\,\, {\gamma \in (0,1]},\\&{{\mathcal {S}}}_0:{{\mathcal {U}}}_{R}\ni \mathbf{u}\mapsto {{\mathcal {S}}}_0(\mathbf{u})=\big ({{\mathcal {S}}}_0^1(\mathbf{u}),{{\mathcal {S}}}_0^2(\mathbf{u}), {{\mathcal {S}}}_0^3(\mathbf{u}),{{\mathcal {S}}}_0^4(\mathbf{u})\big ):=(\mu ^0,\varphi ^0,\xi ^0,\sigma ^0) . \end{aligned}$$

We are now going to investigate the behavior of the family \(\{(\mu _\gamma ,\varphi _\gamma ,\sigma _\gamma )\}_{\gamma >0}\) of deep quench approximations for \(\gamma \searrow 0\). We expect that the solution operator \({{\mathcal {S}}}_\gamma \) yields an approximation of \({{\mathcal {S}}}_0\) as \(\gamma \searrow 0\). This is made rigorous through the following result.

Theorem 3.1

Suppose that the assumptions (A1)–(A4) and (2.12)–(2.14) are fulfilled, and let sequences \(\{\gamma _n\}\subset (0,1]\) and \(\{\mathbf{u}_n\}\subset {{\mathcal {U}}}_{\mathrm{ad}}\) be given such that \(\gamma _n\searrow 0\) and \(\mathbf{u}_n\rightarrow \mathbf{u}\) weakly-star in \({{\mathcal {U}}}\) as \(n\rightarrow \infty \) for some \(\mathbf{u}\in {{\mathcal {U}}}_{\mathrm{ad}}\). Moreover, let \((\mu _{\gamma _n},\varphi _{\gamma _n},\sigma _{\gamma _n})= {{\mathcal {S}}}_{\gamma _n}(\mathbf{u}_n)\), \(n\in {\mathbb {N}}\), and \((\mu ^0,\varphi ^0,\xi ^0,\sigma ^0)={{\mathcal {S}}}_0(\mathbf{u})\). Then, as \(n\rightarrow \infty \),

$$\begin{aligned} \mu _{\gamma _n}\rightarrow \mu ^0&\quad \hbox {weakly-star in }\,X\,\hbox { and strongly in }\, C^0([0,T];H), \end{aligned}$$
$$\begin{aligned} \varphi _{\gamma _n}\rightarrow \varphi ^0&\quad \hbox {weakly-star in} \,{\widetilde{X}} \, \hbox {and strongly in } \,C^0({\overline{Q}}), \end{aligned}$$
$$\begin{aligned} F_{1,\gamma _n}'(\varphi _{\gamma _n})\rightarrow \xi ^0&\quad \hbox {weakly-star in }\,L^\infty (0,T;H), \end{aligned}$$
$$\begin{aligned} \sigma _{\gamma _n}\rightarrow \sigma ^0&\quad \hbox {weakly-star in }\,X\,\hbox { and strongly in }\, C^0([0,T];H), \end{aligned}$$

with the notation introduced in (2.21).


The sequence \(\{\mathbf{u}_n\}\subset {{\mathcal {U}}}_{\mathrm{ad}}\) forms a bounded subset of \({\mathcal U}_R\). Now observe that the conditions (2.13) and (2.14) imply that there is some constant \(C_1>0\) such that

$$\begin{aligned} \Vert F_{1,\gamma }(\varphi _0)\Vert _{C^0({\overline{\Omega }})}\,+\,\Vert F_{1,\gamma }'(\varphi _0)\Vert _{ C^0({\overline{\Omega }})}\,\le \,C_1 \quad \forall \, \gamma \in (0,1]. \end{aligned}$$

Therefore, a closer inspection of the a priori estimates carried out in the proofs of [16, Thms. 2.2, 2.3] reveals that the estimates (2.18) and (2.20) (where \(F^{(i)} \) are replaced by \(F_2^{(i)} \)) hold uniformly for \(\gamma \in (0,1]\); in particular, the constant \(K_1\) introduced in Theorem 2.3 can be chosen in such a way that

$$\begin{aligned} \Vert \mu _\gamma \Vert _X\,+\,\Vert \varphi _\gamma \Vert _{{\widetilde{X}}} \,+\,\Vert \sigma _\gamma \Vert _X\,\le \,K_1 \quad \forall \,\gamma \in (0,1]. \end{aligned}$$

In addition, there is some \(C_2>0\) such that

$$\begin{aligned} \Vert F_{1,\gamma }'(\varphi _\gamma )\Vert _{L^\infty (0,T;H)}\,\le \,C_2\quad \forall \,\gamma \in (0,1]. \end{aligned}$$

Therefore, there are limits \((\mu ,\varphi ,\xi ,\sigma )\) and a subsequence of \(\{(\mu _{\gamma _n},\varphi _{\gamma _n},\sigma _{\gamma _n})\}\), which for convenience is again indexed by n, such that, as \(n\rightarrow \infty \),

$$\begin{aligned} \mu _{\gamma _n}\rightarrow \mu&\quad \hbox {weakly-star in }\,X\,\hbox { and strongly in }\, C^0([0,T];H), \end{aligned}$$
$$\begin{aligned} \varphi _{\gamma _n}\rightarrow \varphi&\quad \hbox {weakly-star in }\,\widetilde{X}\,\hbox { and strongly in } \,C^0({\overline{Q}}), \end{aligned}$$
$$\begin{aligned} F_{1,\gamma _n}'(\varphi _{\gamma _n})\rightarrow \xi&\quad \hbox {weakly-star in }\,L^\infty (0,T;H), \end{aligned}$$
$$\begin{aligned} \sigma _{\gamma _n}\rightarrow \sigma&\quad \hbox {weakly-star in }\,X\,\hbox { and strongly in }\,{C^0([0,T];H)}. \end{aligned}$$

Here, the strong convergence results follow from well-known compactness results (see, e.g., [46, Sect. 8, Cor. 4]). We then have to show that \((\mu ,\varphi ,\xi ,\sigma )\) is a solution to (2.6)–(2.9) in the sense of Theorem 2.2 for \(F_1=I_{[-1,1]}\) and control \(\,\mathbf{u}\). To this end, we pass to the limit as \(n\rightarrow \infty \) in the system (2.6)–(2.9), written for \(F_1=F_{1,\gamma _n}\) and \(\mathbf{u}=\mathbf{u}_n\), for \(n\in {\mathbb {N}}\). In view of the strong convergence properties stated in (3.7), (3.8), and (3.10), it is easily seen that \((\mu ,\varphi ,\sigma )\) fulfills the initial conditions in (2.9). Moreover, owing to the Lipschitz continuity of \(P,\mathbbm {h},F_2'\) and the strong convergence in (3.8), we conclude that

$$\begin{aligned} P(\varphi _{\gamma _n})\rightarrow P(\varphi ),\quad \mathbbm {h}(\varphi _{\gamma _n})\rightarrow \mathbbm {h}(\varphi ),\quad F_2'(\varphi _{\gamma _n})\rightarrow F_2'(\varphi ), \quad \hbox {all strongly in}\,C^0({\overline{Q}}). \end{aligned}$$

Using this and (3.7)–(3.10) once more, we obtain by passage to the limit as \(n\rightarrow \infty \) that \((\mu ,\varphi ,\xi ,\sigma )\) satisfies the time-integrated version of the variational equalities (with test functions \(v\in L^2(0,T;V)\)) stated in (2.6)–(2.8). Notice that this time-integrated version of the variational equalities is equivalent to them.

It remains to show that \(\xi \in \partial I_{[-1,1]}(\varphi )\) almost everywhere in \(\,Q\). To this end, we define on \(L^2(Q)\) the convex functional

$$\begin{aligned} \Phi (v)=\iint _Q I_{[-1,1]}(v), \quad \hbox {if} \,I_{[-1,1]}(v)\in L^1(Q),\, \,\,\hbox {and} \,\Phi (v)=+\infty , \,\hbox {otherwise}. \end{aligned}$$

It then suffices to show that \(\xi \) belongs to the subdifferential of \(\,\Phi \,\) at \(\,\varphi \), i.e., that

$$\begin{aligned} \Phi (v)-\Phi (\varphi )\,\ge \,\iint _Q \xi \,(v-\varphi )\quad \forall \,v\in L^2(Q). \end{aligned}$$

At this point, we recall (2.19), which yields that \(\varphi _{\gamma _n}(x,t)\in [-1,1]\) on \({\overline{Q}}\). Hence, by (3.8), also \(\varphi (x,t)\in [-1,1]\) on \({\overline{Q}}\), and thus \(\,\Phi (\varphi )=0\). Now observe that in the case \(I_{[-1,1]}(v)\not \in L^1(Q)\) the inequality (3.12) holds true since its left-hand side is infinite. If, however, \(I_{[-1,1]}(v)\in L^1(Q)\), then obviously \(v(x,t)\in [-1,1]\) almost everywhere in \(\,Q\), and by virtue of (1.12) and (1.13) it follows from Lebesgue’s dominated convergence theorem that

$$\begin{aligned} \lim _{n\rightarrow \infty }\iint _Q F_{1,\gamma _n}(v)= \Phi (v)=0. \end{aligned}$$
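To make this limit concrete, suppose, purely as an illustrative assumption since (1.12) and (1.13) are not restated in this section, that the approximating potentials have the usual deep quench form \(F_{1,\gamma }(r)=\gamma \,\ell (r)\) with \(\ell (r)=(1+r)\ln (1+r)+(1-r)\ln (1-r)\) for \(r\in [-1,1]\), where \(\ell \) is our own shorthand. Then \(0\le \ell \le 2\ln 2\) on \([-1,1]\), and for every \(v\) with values in \([-1,1]\) almost everywhere in \(Q\) one would obtain the explicit bound

$$\begin{aligned} 0\,\le \,\iint _Q F_{1,\gamma _n}(v)\,\le \,2\ln (2)\,\gamma _n\,|Q| \,\rightarrow \,0 \quad \hbox {as }\,n\rightarrow \infty , \end{aligned}$$

where \(|Q|\) denotes the Lebesgue measure of \(Q\). Under this assumption, the limit relation above can be checked directly.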

Now, by the convexity of \(F_{1,\gamma _n}\), and since \(F_{1,\gamma _n}(\varphi _{\gamma _n})\) is nonnegative, for all \(v\in L^2(Q)\) we have that

$$\begin{aligned} F_{1,\gamma _n}'(\varphi _{\gamma _n})(v-\varphi _{\gamma _n})\,\le \,F_{1,\gamma _n}(v)-F_{1,\gamma _n}(\varphi _{\gamma _n})\,\le \,F_{1,\gamma _n}(v) \quad \hbox {a.e. in} \,Q. \end{aligned}$$

Using (3.8) and (3.9), we thus obtain the following chain of (in)equalities:

$$\begin{aligned} \iint _Q \xi (v-\varphi )&=\lim _{n\rightarrow \infty }\iint _Q F_{1,\gamma _n}'(\varphi _{\gamma _n})(v-\varphi _{\gamma _n}) \,\le \,{\limsup _{n\rightarrow \infty }}\iint _Q\Bigl (F_{1,\gamma _n}(v)-F_{1,\gamma _n}(\varphi _{\gamma _n})\Bigr )\\&\le \lim _{n\rightarrow \infty }\iint _QF_{1,\gamma _n}(v)\,=\,\Phi (v)\,=\,\Phi (v)-\Phi (\varphi ), \end{aligned}$$

which shows the validity of (3.12). Hence, the quadruplet \((\mu ,\varphi ,\xi ,\sigma )\) is a solution to the state system in the sense of Definition 2.1 for \(F_1=I_{[-1,1]}\) and the control \(\mathbf{u}\). Since this solution is uniquely determined, we must have \((\mu ,\varphi ,\xi ,\sigma )= (\mu ^0,\varphi ^0,\xi ^0,\sigma ^0)={{\mathcal {S}}}_0(\mathbf{u})\). Finally, the uniqueness of the limit also entails that the convergence properties (3.7)–(3.10) are in fact valid for the entire sequence \((\mu _{\gamma _n},\varphi _{\gamma _n},\sigma _{\gamma _n})\) and not only for a subsequence. This concludes the proof of the assertion.

Remark 3.2

Note that the stronger conditions on the data required by (2.13)–(2.14) yield more regularity for the solution in the case \(F_1=I_{[-1,1]}\) than the one obtained from Theorem 2.2. Indeed, we have

$$\begin{aligned} \mu \in X, \quad \varphi \in {\widetilde{X}},\quad \xi \in L^\infty (0,T;H), \quad \sigma \in X. \end{aligned}$$

Remark 3.3

The reader may wonder whether a result similar to Theorem 3.1 can be proved in the case when the additional assumptions (2.13)–(2.14) are not required for the initial data of the state system of (\({\mathcal {CP}}_0\)), i.e., of the problem (2.6)–(2.9) with \(F_1 = I_{[-1,1]}\). Indeed, we recall that Theorem 2.2 states existence and uniqueness of a weak solution to the problem provided the initial data just satisfy (2.10). Note that in this weaker setting the condition \(F_1 (\varphi _0) \in L^1(\Omega )\) entails that \(-1\le \varphi _0\le 1\) a.e. in \(\Omega \). The answer to the above question is affirmative, but in this case the set of initial data \((\mu _0,\varphi _0,\sigma _0)\) should be approximated (as \(F_1\) is by \(F_{1,\gamma }\)) by a family \(\{(\mu _{0,\gamma },\varphi _{0,\gamma },\sigma _{0,\gamma })\}\) which does satisfy (2.13) and (2.14) for every \(\gamma \in (0,1]\) and converges to \((\mu _0,\varphi _0,\sigma _0)\) in some topology as \(\gamma \searrow 0\). We prove the existence of such a family, with a precise statement and all required conditions, in Lemma A.1 in the Appendix. As for the corresponding analogue of Theorem 3.1, we point out that (3.2)–(3.5) would hold with the spaces \(X\) and \({\widetilde{X}}\) now replaced by (cf. (2.3)–(2.4))

$$\begin{aligned}&X_w \,:=\,H^1(0,T;V^*)\cap {L^\infty (0,T; H)}\cap L^2(0,T;V), \\&{\widetilde{X}}_w \, :=\, H^{1}(0,T;H) \cap L^\infty (0,T; V)\cap L^2(0,T;W_0)\cap L^\infty (Q), \end{aligned}$$

and with \(C^{0}([0,T];H)\) replaced by \(L^2 (0,T; H)\) in (3.2) and (3.5), \(C^0({\overline{Q}})\) by \(C^{0}([0,T];H)\) in (3.3) (and (3.11)), and \(L^\infty (0,T ; H)\) by \(L^2 (0,T; H)\) in (3.4). Moreover, if one wants to verify the subsequent theory in this weaker setting, it turns out it can be adapted without major modifications (see the subsequent Remark 5.1).

4 Existence and Approximation of Optimal Controls

Beginning with this section, we investigate the optimal control problem (\({\mathcal {CP}}_0\)) of minimizing the cost functional (1.1) over the admissible set \({{\mathcal {U}}}_{\mathrm{ad}}\) subject to the state system (1.2)–(1.6) in the form (2.6)–(2.9) for \(F_1=I_{[-1,1]}\) under the additional assumptions (C1)–(C3). Observe that (C3) implies that \(\,g\,\) is weakly sequentially lower semicontinuous on \(L^2(Q)^2\). Moreover, denoting in the following by \(\,\partial \,\) the subdifferential mapping in \(L^2(Q)^2\), it follows from standard convex analysis that \(\,\partial g\,\) is defined on the entire space \(L^2(Q)^2\) and is a maximal monotone operator. In addition, the mapping \(((\mu ,\varphi ,\sigma ),\mathbf{u})\mapsto {{\mathcal {J}}}((\mu ,\varphi ,\sigma ),\mathbf{u})\) defined by the cost functional (1.1) is obviously continuous and convex (and thus weakly sequentially lower semicontinuous) on the space \(\bigl (L^2(Q)\times C^0([0,T];L^{2}(\Omega ))\times L^2(Q)\bigr ) \times L^2(Q)^2\).

In comparison with (\({\mathcal {CP}}_0\)), we consider for \(\gamma >0\) the following control problem:

(\({\mathcal {CP}}_\gamma \))   Minimize \(\,{\mathcal J}((\mu ,\varphi ,\sigma ),\mathbf{u})\,\) for \(\,\mathbf{u}\in {{\mathcal {U}}}_{\mathrm{ad}}\), subject to \((\mu ,\varphi ,\sigma ) ={{\mathcal {S}}}_\gamma (\mathbf{u})\).

We expect that the minimizers of (\({\mathcal {CP}}_\gamma \)) are for \(\gamma \searrow 0\) related to minimizers of (\({\mathcal {CP}}_0\)). Prior to giving an affirmative answer to this conjecture, we first show an existence result for (\({\mathcal {CP}}_\gamma \)).

Proposition 4.1

Suppose that (A1)–(A4), (C1)–(C3), and (2.12)–(2.14) are satisfied. Then (\({\mathcal C} {\mathcal P}_{\gamma }\)) has for every \(\gamma \in (0,1]\) a solution.


Let \(\gamma \in (0,1]\) be fixed, and assume that a minimizing sequence \(\,\{((\mu _n,\varphi _n,\sigma _n),\mathbf{u}_n)\}\) for (\({\mathcal {CP}}_\gamma \)) is given, where \(\mathbf{u}_n\in {{\mathcal {U}}}_{\mathrm{ad}}\) and \((\mu _n,\varphi _n,\sigma _n)={{\mathcal {S}}}_\gamma (\mathbf{u}_n)\) for all \(n\in {\mathbb {N}}\). Since \(\{\mathbf{u}_n\}\subset {{\mathcal {U}}}_{\mathrm{ad}}\), we may without loss of generality assume that \(\,\mathbf{u}_n\rightarrow \mathbf{u}\) weakly-star in \({{\mathcal {U}}}\) for some \(\mathbf{u}\in {{\mathcal {U}}}_{\mathrm{ad}}\). Moreover, by the general bound (2.18), there are a subsequence of \(\{(\mu _n,\varphi _n,\sigma _n)\}\,\) (which is again labeled by \(n\in {\mathbb {N}}\)) and limit points \(\mu ,\varphi ,\sigma \) such that (3.7), (3.8), (3.10), and (3.11) are valid with \((\mu _{\gamma _n},\varphi _{\gamma _n},\sigma _{\gamma _n})\) replaced by \((\mu _n,\varphi _n,\sigma _n)\). In addition, since \(\gamma >0\) is fixed, we conclude from (2.19) that there are constants \(r_*(\gamma ),r^*(\gamma )\) such that

$$\begin{aligned} -1<r_*(\gamma )\le \varphi _n\le r^*(\gamma )<1 \quad \hbox {on} \,\overline{Q} \,\hbox { for all}\,\, n\in {\mathbb {N}}, \end{aligned}$$

from which it also follows that \(\,F_{1,\gamma }'(\varphi _n)\rightarrow F_{1,\gamma }'(\varphi )\) uniformly in \({\overline{Q}}\) as \(n \rightarrow \infty \). We now write the state system (1.2)–(1.6) for \(F_1=F_{1,\gamma }\), \((\mu _n,\varphi _n,\sigma _n)\), \(\mathbf{u}_n = {(u_{n,1}, u_{n,2}){}}\), and pass to the limit as \(n\rightarrow \infty \), easily arriving at the conclusion that \((\mu ,\varphi ,\sigma ) ={{\mathcal {S}}}_\gamma (\mathbf{u})\). Thus, the pair \(((\mu ,\varphi ,\sigma ),\mathbf{u})\) is admissible for the minimization problem (\({\mathcal {CP}}_\gamma \)). The lower semicontinuity properties of the cost functional then yield that \(((\mu ,\varphi ,\sigma ),\mathbf{u})\) is a solution to (\({\mathcal {CP}}_\gamma \)).
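In detail, the last step may be written out as follows, where \(j_\gamma \) is our shorthand (not used elsewhere) for the infimum of the cost functional over all admissible pairs of (\({\mathcal {CP}}_\gamma \)):

$$\begin{aligned} j_\gamma \,\le \,{{\mathcal {J}}}({{\mathcal {S}}}_\gamma (\mathbf{u}),\mathbf{u})\,\le \,\liminf _{n\rightarrow \infty }\,{{\mathcal {J}}}({{\mathcal {S}}}_\gamma (\mathbf{u}_n),\mathbf{u}_n)\,=\,j_\gamma . \end{aligned}$$

The first inequality uses the admissibility of the limit pair, the second the weak sequential lower semicontinuity of the cost functional along the convergence of the states and of \(\mathbf{u}_n\) to \(\mathbf{u}\) (weak-star convergence in \({{\mathcal {U}}}\) implying, on the bounded set \(Q\), weak convergence in \(L^2(Q)^2\)), and the final equality the fact that the sequence is minimizing.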

Proposition 4.2

Suppose that (A1)–(A4), (C1)–(C3), and (2.12)–(2.14) are satisfied, and let sequences \(\,\{\gamma _n\}\subset (0,1]\,\) and \(\,\{\mathbf{u}_n\}\subset {{\mathcal {U}}}_{\mathrm{ad}}\,\) be given such that, as \(n\rightarrow \infty \), \(\,\gamma _n\searrow 0\,\) and \(\,\mathbf{u}_n\rightarrow \mathbf{u}\,\) weakly-star in \({\mathcal U}\) for some \(\,\mathbf{u}\in {{\mathcal {U}}}_{\mathrm{ad}}\). Then,

$$\begin{aligned}&{\mathcal {J}}( {{\mathcal {S}}}_0(\mathbf{u}),\mathbf{u})\,\le \,\liminf _{n\rightarrow \infty }\,{\mathcal {J}} ({{\mathcal {S}}}_{\gamma _n}(\mathbf{u}_n),\mathbf{u}_n), \end{aligned}$$
$$\begin{aligned}&{\mathcal {J}}({{\mathcal {S}}}_0({\mathbf {v}}),\mathbf {v})\,=\,\lim _{n\rightarrow \infty }\, {\mathcal {J}}({{\mathcal {S}}}_{\gamma _n}(\mathbf {v}),\mathbf {v}) \quad \forall \,\mathbf {v}\in {{\mathcal {U}}}_{\mathrm{ad}}. \end{aligned}$$


Theorem 3.1 yields that the component \(\varphi _{\gamma _n}\) of \({{\mathcal {S}}}_{\gamma _n}(\mathbf{u}_n)=(\mu _{\gamma _n},\varphi _{\gamma _n},\sigma _{\gamma _n})\) fulfills the convergence (3.3). The validity of (4.1) is then a direct consequence of the semicontinuity properties of the cost functional (1.1).

Now suppose that \(\mathbf {v}\in {{\mathcal {U}}}_{\mathrm{ad}}\) is arbitrarily chosen, and put \((\mu _{\gamma _n},\varphi _{\gamma _n},\sigma _{\gamma _n}):={{\mathcal {S}}}_{\gamma _n}(\mathbf {v})\) for all \(n\in {\mathbb {N}}\), as well as \((\mu ^0,\varphi ^0,\xi ^0,\sigma ^0):={{\mathcal {S}}}_0(\mathbf {v})\). Applying Theorem 3.1 with the constant sequence \(\mathbf{u}_n=\mathbf {v}\), \(n\in {\mathbb {N}}\), we see that (3.2)–(3.5) are valid once more. Since the first two summands of the cost functional are continuous with respect to the strong topology of \(C^0([0,T];H)\), we conclude the validity of (4.2).
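For the reader's convenience, here is a sketch of the estimate behind (4.2); it assumes that the targets satisfy \({\widehat{\varphi }}_Q\in L^2(Q)\) and \({\widehat{\varphi }}_\Omega \in L^2(\Omega )\), which we take to be part of the standing assumptions (C1)–(C3). Since the control \(\mathbf {v}\) is the same on both sides, only the tracking terms differ, and the identity \(a^2-b^2=(a-b)(a+b)\) together with the Cauchy–Schwarz inequality gives

$$\begin{aligned} \bigl |{{\mathcal {J}}}({{\mathcal {S}}}_{\gamma _n}(\mathbf {v}),\mathbf {v})-{{\mathcal {J}}}({{\mathcal {S}}}_0(\mathbf {v}),\mathbf {v})\bigr |\,\le \,&\frac{b_1}{2}\,\Vert \varphi _{\gamma _n}-\varphi ^0\Vert _{L^2(Q)}\,\Vert \varphi _{\gamma _n}+\varphi ^0-2{\widehat{\varphi }}_Q\Vert _{L^2(Q)}\\&+\frac{b_2}{2}\,\Vert \varphi _{\gamma _n}(T)-\varphi ^0(T)\Vert _{L^2(\Omega )}\,\Vert \varphi _{\gamma _n}(T)+\varphi ^0(T)-2{\widehat{\varphi }}_\Omega \Vert _{L^2(\Omega )}. \end{aligned}$$

By (3.3), the first factor in each product tends to zero while the second one stays bounded, so that the right-hand side vanishes as \(n\rightarrow \infty \).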

We are now in a position to prove the existence of minimizers for the problem (\({\mathcal {CP}}_0\)). We have the following result.

Corollary 4.3

Suppose that (A1)–(A4), (C1)–(C3), and (2.12)–(2.14) are satisfied. Then the optimal control problem (\(\mathcal {CP}_0\)) has at least one solution.


Pick an arbitrary sequence \(\{\gamma _n\}\subset (0,1]\) such that \(\gamma _n\searrow 0\) as \(n\rightarrow \infty \). By virtue of Proposition 4.1, the optimal control problem (\({\mathcal {CP}}_{\gamma _n}\)) has for every \(n\in {\mathbb {N}}\) a solution \(((\mu _{\gamma _n},\varphi _{\gamma _n},\sigma _{\gamma _n}),\mathbf{u}_ {\gamma _n})\) where \((\mu _{\gamma _n},\varphi _{\gamma _n},\sigma _{\gamma _n})={{\mathcal {S}}}_{\gamma _n}(\mathbf{u}_{\gamma _n})\) for \(n\in {\mathbb {N}}\). Since \({{\mathcal {U}}}_{\mathrm{ad}}\) is bounded in \({{\mathcal {U}}}\), we may without loss of generality assume that \(\mathbf{u}_{\gamma _n}\rightarrow \mathbf{u}\) weakly-star in \({{\mathcal {U}}}\) for some \(\mathbf{u}\in {{\mathcal {U}}}_{\mathrm{ad}}\). We then obtain from Theorem 3.1 that (3.2)–(3.5) hold true with \((\mu ^0,\varphi ^0,\xi ^0,\sigma ^0)={{\mathcal {S}}}_0(\mathbf{u})\). Invoking the optimality of \(((\mu _{\gamma _n},\varphi _{\gamma _n},\sigma _{\gamma _n}),\mathbf{u}_{\gamma _n})\) for (\({\mathcal {CP}}_{\gamma _n}\)), we then find from Proposition 4.2 for every \(\,\mathbf {v}\in {{\mathcal {U}}}_{\mathrm{ad}}\,\) the chain of (in)equalities

$$\begin{aligned}&{{\mathcal {J}}}({{\mathcal {S}}}_0(\mathbf{u}),\mathbf{u})\,\le \,\liminf _{{n}\rightarrow \infty }\, {\mathcal J}({{\mathcal {S}}}_{\gamma _n}(\mathbf{u}_{\gamma _n}),\mathbf{u}_{\gamma _n})\,\le \, \liminf _{{n}\rightarrow \infty }\,{\mathcal J}({{\mathcal {S}}}_{\gamma _n}(\mathbf {v}),\mathbf {v}) \,=\,{\mathcal J}({{\mathcal {S}}}_0(\mathbf {v}),\mathbf {v} ), \end{aligned}$$

which yields that \(\,({{\mathcal {S}}}_0(\mathbf{u}),\mathbf{u})\,\) is an optimal pair for (\({\mathcal {CP}}_0\)). The assertion is thus proved.

Theorem 3.1 and the proof of Corollary 4.3 indicate that optimal controls of (\({\mathcal {CP}}_\gamma \)) are “close” to optimal controls of (\({\mathcal {CP}}_0\)) as \(\gamma \) approaches zero. However, they do not yield any information on whether every optimal control of (\({\mathcal {CP}}_0\)) can be approximated in this way. In fact, such a global result cannot be expected to hold true. Nevertheless, a local answer can be given by employing a well-known trick. To this end, let \(\overline{\mathbf{u}}=(\overline{u}_1,\overline{u}_2)\in {{\mathcal {U}}}_{\mathrm{ad}}\) be an optimal control for (\({\mathcal {CP}}_0\)) with the associated state \({{\mathcal {S}}}_0(\overline{\mathbf{u}})\). We associate with this optimal control the adapted cost functional

$$\begin{aligned} \widetilde{{\mathcal {J}}}((\mu ,\varphi ,\sigma ),\mathbf{u}):= {{\mathcal {J}}}((\mu ,\varphi ,\sigma ),\mathbf{u})\,+\,\frac{1}{2}\,\Vert \mathbf{u}-\overline{\mathbf{u}}\Vert ^2_{L^2(Q)^2} \end{aligned}$$

and a corresponding adapted optimal control problem for \(\gamma >0\), namely:

(\(\widetilde{\mathcal {CP}}_{\gamma }\))   Minimize \(\,\, \widetilde{{\mathcal {J}}}((\mu ,\varphi ,\sigma ),\mathbf{u})\,\,\) for \(\,\mathbf{u}\in {{\mathcal {U}}}_{\mathrm{ad}}\), subject to the condition that \((\mu ,\varphi ,\sigma ) = {{\mathcal {S}}}_\gamma (\mathbf{u})\).

With essentially the same proof as that of Proposition 4.1 (which needs no repetition here), we can show the following result.

Lemma 4.4

Suppose that the assumptions of Proposition 4.1 are fulfilled. Then the adapted optimal control problem (\(\widetilde{{\mathcal {CP}}}_{\gamma }\)) has for every \(\gamma >0\) at least one solution.

We are now in the position to give a partial answer to the question raised above through the following result.

Theorem 4.5

Let the assumptions of Proposition 4.1 be fulfilled, suppose that \(\overline{\mathbf{u}}\in {{\mathcal {U}}}_{\mathrm{ad}}\) is an arbitrary optimal control of \(({\mathcal {CP}}_{0})\) with associated state \(({\overline{\mu }},{\overline{\varphi }},{\overline{\xi }},{\overline{\sigma }})={{\mathcal {S}}}_0(\overline{\mathbf{u}})\), and let \(\,\{\gamma _k\}_{k\in {\mathbb {N}}}\subset (0,1]\,\) be any sequence such that \(\,\gamma _k\searrow 0\,\) as \(\,k\rightarrow \infty \). Then there exist a subsequence \(\{\gamma _{n}\}\) of \(\{\gamma _k\}\), and, for every \(n\in {\mathbb {N}}\), an optimal control \(\,\mathbf{u}_{\gamma _{n}}\in {{\mathcal {U}}}_{\mathrm{ad}}\,\) of the adapted problem \((\widetilde{\mathcal {CP}}_{\gamma _{n}})\) with associated state \((\mu _{\gamma _{n}},\varphi _{\gamma _{n}},\sigma _{\gamma _n})={{\mathcal {S}}}_{\gamma _{n}} (\mathbf{u}_{\gamma _{n}})\), such that, as \(n\rightarrow \infty \),

$$\begin{aligned}&\mathbf{u}_{\gamma _{n}}\rightarrow \overline{\mathbf{u}}\quad \hbox {strongly in}\,L^2(Q)^2, \end{aligned}$$

and such that (3.2)–(3.5) hold true with \((\mu ^0,\varphi ^0,\xi ^0,\sigma ^0)\) replaced by \(({\overline{\mu }},{\overline{\varphi }},{\overline{\xi }},{\overline{\sigma }})\). Moreover, we have

$$\begin{aligned}&\lim _{n\rightarrow \infty }\,\widetilde{{\mathcal J}}({{\mathcal {S}}}_{\gamma _{n}}(\mathbf{u}_{\gamma _{n}}),\mathbf{u}_{\gamma _{n}}) \,=\,{\mathcal J}({{\mathcal {S}}}_0(\overline{\mathbf{u}}),\overline{\mathbf{u}}). \end{aligned}$$


For any \( k\in {\mathbb {N}}\), we pick an optimal control \(\mathbf{u}_{\gamma _k} \in {{\mathcal {U}}}_{\mathrm{ad}}\,\) for the adapted problem (\(\widetilde{{\mathcal {CP}}}_{{\gamma }_k}\)) and denote by \((\mu _{\gamma _k},\varphi _{\gamma _k},\sigma _{\gamma _k})={{\mathcal {S}}}_{\gamma _k}(\mathbf{u}_{\gamma _k})\) the associated strong solution to the state system (1.2)–(1.6). By the boundedness of \({{\mathcal {U}}}_{\mathrm{ad}}\) in \({\mathcal {U}}\), there is some subsequence \(\{\gamma _{n}\}\) of \(\{\gamma _k\}\) such that

$$\begin{aligned} \mathbf{u}_{\gamma _{n}}\rightarrow \mathbf{u}\quad \hbox {weakly-star in}\,{{\mathcal {U}}} \quad \hbox {as }\,n\rightarrow \infty , \end{aligned}$$

for some \(\mathbf{u}\in {{\mathcal {U}}}_{\mathrm{ad}}\). Thanks to Theorem 3.1, the convergence properties (3.2)–(3.5) hold true correspondingly for the quadruple \((\mu ^0,\varphi ^0,\xi ^0,\sigma ^0)={{\mathcal {S}}}_0(\mathbf{u})\). In addition, the pair \(({{\mathcal {S}}}_0(\mathbf{u}),\mathbf{u})\) is admissible for (\({\mathcal C} {\mathcal P}_0\)).

We now aim at showing that \( \mathbf{u}=\overline{\mathbf{u}}\). Once this is shown, it follows from the unique solvability of the state system (2.6)–(2.9) that also \((\mu ^0,\varphi ^0,\xi ^0,\sigma ^0)= ({\overline{\mu }},{\overline{\varphi }}, {\overline{\xi }},{\overline{\sigma }})\). Now observe that, owing to the weak sequential lower semicontinuity of \(\widetilde{{\mathcal {J}}}\), and in view of the optimality property of \(({{\mathcal {S}}}_0(\overline{\mathbf{u}}),\overline{\mathbf{u}})\) for problem \(({\mathcal C} {\mathcal P}_0)\),

$$\begin{aligned} \liminf _{n\rightarrow \infty }\, \widetilde{\mathcal J}({{\mathcal {S}}}_{\gamma _n}(\mathbf{u}_{\gamma _n}),\mathbf{u}_{\gamma _n})\,&\ge \,{\mathcal J}({{\mathcal {S}}}_0(\mathbf{u}),\mathbf{u})\,+\,\frac{1}{2}\, \Vert \mathbf{u}-\overline{\mathbf{u}}\Vert ^2_{L^2(Q)^2}\nonumber \\&\ge \, {\mathcal J}({{\mathcal {S}}}_0(\overline{\mathbf{u}}),\overline{\mathbf{u}})\,+\,\frac{1}{2}\,\Vert \mathbf{u}-\overline{\mathbf{u}}\Vert ^2_{L^2(Q)^2} . \end{aligned}$$

On the other hand, the optimality property of \(\,({{\mathcal {S}}}_{\gamma _{n}}(\mathbf{u}_{\gamma _n}),\mathbf{u}_{\gamma _n}) \,\) for problem (\(\widetilde{{\mathcal {CP}}}_{{\gamma }_n}\)) yields that for any \(n\in {\mathbb {N}}\) we have

$$\begin{aligned} \widetilde{{\mathcal {J}}}({\mathcal S}_{\gamma _{n}}(\mathbf{u}_{\gamma _{n}}), \mathbf{u}_{\gamma _{n}})\,\le \,\widetilde{{\mathcal {J}}}({\mathcal S}_{\gamma _n} (\overline{\mathbf{u}}),\overline{\mathbf{u}})\,{{}=\, {{\mathcal {J}}}({\mathcal S}_{\gamma _n} (\overline{\mathbf{u}}),\overline{\mathbf{u}}) ,} \end{aligned}$$

whence, taking the limit superior as \(n\rightarrow \infty \) on both sides and invoking (4.2) in Proposition 4.2,

$$\begin{aligned} \limsup _{n\rightarrow \infty }\,\widetilde{\mathcal J}({{\mathcal {S}}}_{\gamma _{n}}(\mathbf{u}_{\gamma _n}), \mathbf{u}_{\gamma _n})&\le \,\limsup _{n\rightarrow \infty }\widetilde{\mathcal J}({{\mathcal {S}}}_{\gamma _n}(\overline{\mathbf{u}}),\overline{\mathbf{u}}) \nonumber \\&=\,\limsup _{n\rightarrow \infty } {\mathcal J}({{\mathcal {S}}}_{\gamma _n}(\overline{\mathbf{u}}),\overline{\mathbf{u}}) \,=\,{{\mathcal {J}}}({{\mathcal {S}}}_0(\overline{\mathbf{u}}),\overline{\mathbf{u}}) . \end{aligned}$$

Combining (4.7) with (4.9), we have thus shown that \(\,\frac{1}{2}\,\Vert \mathbf{u}-\overline{\mathbf{u}}\Vert ^2_{L^2(Q)^2}=0\), so that \(\,\mathbf{u}=\overline{\mathbf{u}}\,\) and thus also \(({\overline{\mu }},{\overline{\varphi }},{\overline{\xi }},{\overline{\sigma }}) =(\mu ^0,\varphi ^0,\xi ^0,\sigma ^0)\). Moreover, (4.7) and (4.9) also imply that

$$\begin{aligned} {{\mathcal {J}}}({{\mathcal {S}}}_0(\overline{\mathbf{u}}),\overline{\mathbf{u}})&=\,\widetilde{\mathcal J}({{\mathcal {S}}}_0(\overline{\mathbf{u}}),\overline{\mathbf{u}}) \,=\,\liminf _{n\rightarrow \infty }\, \widetilde{\mathcal J}({{\mathcal {S}}}_{\gamma _n}(\mathbf{u}_{\gamma _n}), \mathbf{u}_{\gamma _{n}})\nonumber \\&\,=\,\limsup _{n\rightarrow \infty }\,\widetilde{\mathcal J}({{\mathcal {S}}}_{\gamma _n}(\mathbf{u}_{\gamma _n}), {\mathbf{u}_{\gamma _n}}) \, =\,\lim _{n\rightarrow \infty }\, \widetilde{\mathcal J}({{\mathcal {S}}}_{\gamma _n}(\mathbf{u}_{\gamma _n}), \mathbf{u}_{\gamma _n}) , \end{aligned}$$

which proves (4.5). Moreover, the convergence properties (3.2)–(3.5) are satisfied. On the other hand, we have that

$$\begin{aligned} {{\mathcal {J}}}({{\mathcal {S}}}_0(\overline{\mathbf{u}}),\overline{\mathbf{u}}) \,&\le \,\liminf _{n\rightarrow \infty }\, {{\mathcal {J}}}({{\mathcal {S}}}_{\gamma _n}(\mathbf{u}_{\gamma _n}), \mathbf{u}_{\gamma _{n}}) \,\le \,\limsup _{n\rightarrow \infty }\, {\mathcal J}({{\mathcal {S}}}_{\gamma _n}(\mathbf{u}_{\gamma _n}), \mathbf{u}_{\gamma _{n}}) \nonumber \\&\le \,\limsup _{n\rightarrow \infty }\,\widetilde{\mathcal J}({{\mathcal {S}}}_{\gamma _n}(\mathbf{u}_{\gamma _n}), {\mathbf{u}_{\gamma _n}}) \, =\,{\mathcal J}({{\mathcal {S}}}_0(\overline{\mathbf{u}}),\overline{\mathbf{u}}) , \end{aligned}$$

so that also \( {\mathcal J}({{\mathcal {S}}}_{\gamma _n}(\mathbf{u}_{\gamma _n}),{\mathbf{u}_{\gamma _n}})\) converges to \( {{\mathcal {J}}}({{\mathcal {S}}}_0(\overline{\mathbf{u}}),\overline{\mathbf{u}})\) as \(n\rightarrow \infty \), and the relation in (4.3) enables us to infer (4.4).
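The final inference can be made explicit. By the definition (4.3) of the adapted cost functional, the penalty term is exactly the difference between the two cost values, so that the convergences just established yield

$$\begin{aligned} \frac{1}{2}\,\Vert \mathbf{u}_{\gamma _n}-\overline{\mathbf{u}}\Vert ^2_{L^2(Q)^2}\,&=\,\widetilde{{\mathcal {J}}}({{\mathcal {S}}}_{\gamma _n}(\mathbf{u}_{\gamma _n}),\mathbf{u}_{\gamma _n})-{{\mathcal {J}}}({{\mathcal {S}}}_{\gamma _n}(\mathbf{u}_{\gamma _n}),\mathbf{u}_{\gamma _n})\\&\rightarrow \,{{\mathcal {J}}}({{\mathcal {S}}}_0(\overline{\mathbf{u}}),\overline{\mathbf{u}})-{{\mathcal {J}}}({{\mathcal {S}}}_0(\overline{\mathbf{u}}),\overline{\mathbf{u}})\,=\,0 \quad \hbox {as }\,n\rightarrow \infty , \end{aligned}$$

which is the strong convergence (4.4).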

5 First-order Necessary Optimality Conditions

We now derive first-order necessary optimality conditions for the control problem (\({\mathcal {CP}}_0\)), using the corresponding conditions for (\(\widetilde{{\mathcal {CP}}}_\gamma \)) as approximations. To this end, we generally assume that the conditions (A1)–(A4), (C1)–(C3), and (2.12)–(2.14) are fulfilled. Now let \(\overline{\mathbf{u}}\in {{\mathcal {U}}}_{\mathrm{ad}}\) be any fixed optimal control for (\({\mathcal {CP}}_0\)) with associated state \(({\overline{\mu }},{\overline{\varphi }},{\overline{\xi }},{\overline{\sigma }})={{\mathcal {S}}}_0(\overline{\mathbf{u}})\), and assume that \(\gamma \in (0, 1]\) is fixed. Moreover, assume that \(\overline{\mathbf{u}}_\gamma ={({\overline{u}}_{\gamma ,1}, {\overline{u}}_{\gamma ,2})} \in {{\mathcal {U}}}_{\mathrm{ad}}\) is an optimal control for (\(\widetilde{{\mathcal {CP}}}_\gamma \)) with corresponding state \(({\overline{\mu }}_\gamma ,{\overline{\varphi }}_\gamma ,{\overline{\sigma }}_\gamma )={{\mathcal {S}}}_\gamma (\overline{\mathbf{u}}_\gamma )\). Recalling (1.1) and (4.3), we then consider the reduced functionals

$$\begin{aligned}&G_1:{{\mathcal {U}}}_R\ni \mathbf{u}\mapsto {{\mathcal {J}}}_1({{\mathcal {S}}}_\gamma (\mathbf{u}),\mathbf{u})+\frac{1}{2} \,\Vert \mathbf{u}-\overline{\mathbf{u}}\Vert ^2_{L^2(Q)^2},\nonumber \\&\quad G:{{\mathcal {U}}}_{R}\ni \mathbf{u}\mapsto G_1(\mathbf{u})+\kappa \,g({\mathbf{u}}) . \end{aligned}$$

By Theorem 2.5 and the chain rule, \(G_1\) is Fréchet differentiable at \(\overline{\mathbf{u}}_\gamma \), and the Fréchet derivative \(DG_1(\overline{\mathbf{u}}_\gamma )\in {{\mathcal {L}}}({{\mathcal {U}}},{\mathbb {R}})\) is given by

$$\begin{aligned} DG_1(\overline{\mathbf{u}}_\gamma )(\mathbf {h})=&\,b_1\iint _Q\bigl ({\overline{\varphi }}_\gamma -{\widehat{\varphi }}_Q \bigr )\,\rho ^{\mathbf {h}}_\gamma \,+\,b_2\int _\Omega \bigl ({\overline{\varphi }}_\gamma (T)-{\widehat{\varphi }}_\Omega \bigr )\,\rho ^{\mathbf {h}}_\gamma (T) \,+\,b_0\iint _Q \overline{\mathbf{u}}_\gamma \cdot \mathbf {h}\nonumber \\&+\iint _Q(\overline{\mathbf{u}}_\gamma -\overline{\mathbf{u}})\cdot \mathbf {h}, \end{aligned}$$

for every \(\mathbf {h}=(h_1,h_2)\) in \({{\mathcal {U}}}\). Here, the dot stands for the Euclidean inner product in \({\mathbb {R}}^2\), and \((\eta ^{\mathbf {h}}_\gamma ,\rho ^{\mathbf {h}}_\gamma ,\zeta ^{\mathbf {h}}_\gamma )\) denotes the unique solution to the linearized system (2.22)–(2.26) associated with \(\mathbf {h}=(h_1,h_2)\) and \(({\overline{\mu }},{\overline{\varphi }},{\overline{\sigma }})=({\overline{\mu }}_\gamma ,{\overline{\varphi }}_\gamma , {\overline{\sigma }}_\gamma )\).
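The last term in (5.2) stems from the quadratic penalty in (4.3). As a quick check, which uses only the bilinearity of the inner product in \(L^2(Q)^2\), one may expand for \(\tau \in {\mathbb {R}}\)

$$\begin{aligned} \frac{1}{2}\,\Vert \overline{\mathbf{u}}_\gamma +\tau \mathbf {h}-\overline{\mathbf{u}}\Vert ^2_{L^2(Q)^2}\,=\,\frac{1}{2}\,\Vert \overline{\mathbf{u}}_\gamma -\overline{\mathbf{u}}\Vert ^2_{L^2(Q)^2}\,+\,\tau \iint _Q(\overline{\mathbf{u}}_\gamma -\overline{\mathbf{u}})\cdot \mathbf {h}\,+\,\frac{\tau ^2}{2}\,\Vert \mathbf {h}\Vert ^2_{L^2(Q)^2}, \end{aligned}$$

and differentiating with respect to \(\tau \) at \(\tau =0\) recovers the term \(\iint _Q(\overline{\mathbf{u}}_\gamma -\overline{\mathbf{u}})\cdot \mathbf {h}\). The same computation applied to \(\frac{b_0}{2}\iint _Q|\mathbf{u}|^2\) gives the contribution \(b_0\iint _Q\overline{\mathbf{u}}_\gamma \cdot \mathbf {h}\).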

As in Remark 2.6, it follows that \(DG_1(\overline{\mathbf{u}}_\gamma )\in {{\mathcal {L}}}({{\mathcal {U}}},{\mathbb {R}})\) can be extended to a continuous linear functional in \({{\mathcal {L}}}(L^2(Q)^2,{\mathbb {R}})\), which is still denoted by \(DG_1(\overline{\mathbf{u}}_\gamma )\) and satisfies (5.2) for every \(\mathbf {h}=(h_1,h_2)\in L^2(Q)^2\).

Now, by arguing along the same lines as in the derivation of [47, Lem. 3.1], we conclude that there is some \(\overline{\varvec{\lambda }}_\gamma \in \partial g(\overline{\mathbf{u}}_\gamma ) \subset L^2(Q)^2\) such that the following variational inequality is satisfied:

$$\begin{aligned} DG_1(\overline{\mathbf{u}}_\gamma )(\mathbf {v}-\overline{\mathbf{u}}_\gamma )\,+\,\kappa \iint _Q \overline{\varvec{\lambda }}_\gamma \cdot (\mathbf {v}-\overline{\mathbf{u}}_\gamma )\,\ge \,0\quad \forall \,\mathbf {v}\in {{\mathcal {U}}}_{\mathrm{ad}}. \end{aligned}$$

As usual, we simplify (5.3) by means of the adjoint state variables \((p_\gamma ,q_\gamma , r_\gamma )\), which are defined as the solution triple \((p,q,r)\) to the adjoint system, formally given by the backward-in-time parabolic system

$$\begin{aligned}&- \partial _tp - \beta \partial _tq - \Delta q + \chi \Delta r + F_{1,\gamma }''({\overline{\varphi }}_\gamma )q+F_2''({{\overline{\varphi }}}_\gamma ) q + \mathbbm {h}'({{\overline{\varphi }}}_\gamma )\,{{\overline{u}}_{\gamma ,1}}\, p \nonumber \\&-P'({{\overline{\varphi }}}_\gamma )({{\overline{\sigma }}}_\gamma +\chi (1-{{\overline{\varphi }}}_\gamma )-{{\overline{\mu }}}_\gamma )(p-r) + \chi P({{\overline{\varphi }}}_\gamma )(p-r) ={{b_1}} ({{\overline{\varphi }}}_\gamma - {\widehat{\varphi }}_Q)&\hbox {in }\,Q , \end{aligned}$$
$$\begin{aligned}&-\alpha \partial _tp-\Delta p -q + P({{\overline{\varphi }}}_\gamma )(p-r)=0&\hbox {in }\,Q , \end{aligned}$$
$$\begin{aligned}&- \partial _tr -\Delta r - \chi q - P({{\overline{\varphi }}}_\gamma )(p-r) =0&\hbox {in }\,Q , \end{aligned}$$
$$\begin{aligned}&\partial _\mathbf{n}p=\partial _\mathbf{n}q=\partial _\mathbf{n}r=0&\hbox {on }\,\Sigma , \end{aligned}$$
$$\begin{aligned}&(p+\beta q)(T)= {{b_2}}({{\overline{\varphi }}}_\gamma (T)-\widehat{\varphi }_{\Omega }), \quad \alpha p(T)= 0,\quad r(T)=0&\hbox {in }\,\Omega . \end{aligned}$$

Let us point out that the terminal condition for \(\,p+\beta q\,\) prescribes a final datum that only belongs to \(L^{2}(\Omega )\). Therefore, the first Eq. (5.4) has to be understood in a weak sense. According to [16, Thm. 5.2], the adjoint system above admits, for every \(\gamma \), a unique solution \((p_\gamma ,q_\gamma ,r_\gamma )\) satisfying

$$\begin{aligned}&p_\gamma +\beta q_\gamma \in H^1(0,T;V^*), \end{aligned}$$
$$\begin{aligned}&p_\gamma \in H^{1}(0,T;H) \cap L^\infty (0,T ; V) \cap L^2 (0,T; W_0) \cap L^\infty (Q), \end{aligned}$$
$$\begin{aligned}&q_\gamma \in L^\infty (0,T ; H) \cap L^2 (0,T; V), \end{aligned}$$
$$\begin{aligned}&r_\gamma \in H^{1}(0,T;H) \cap L^\infty (0,T ; V) \cap L^2 (0,T; W_0)\cap L^\infty (Q), \end{aligned}$$

such that \((p,q,r)=(p_\gamma ,q_\gamma ,r_\gamma )\) satisfies the variational system

$$\begin{aligned}- & {} \mathopen \langle \partial _t(p +\beta q),v\mathclose \rangle + \int _\Omega \nabla q \cdot \nabla v - \chi \int _\Omega \nabla r \cdot \nabla v + \int _\Omega F_{1,\gamma }''({{\overline{\varphi }}}_\gamma )\, q\, v +\int _\Omega F_2''({\overline{\varphi }}_\gamma )\, q\,v \nonumber \\&\qquad + \int _\Omega \mathbbm {h}'({{\overline{\varphi }}}_\gamma )\,{\overline{u}}_{\gamma ,1}\, p\, v -\int _\Omega P'({{\overline{\varphi }}}_\gamma )({{\overline{\sigma }}}_\gamma +\chi (1-{{\overline{\varphi }}}_\gamma )-{{\overline{\mu }}}_\gamma ) (p-r) v \nonumber \\&\qquad + \chi \int _\Omega P({{\overline{\varphi }}}_\gamma )(p-r)v = b_1 \int _\Omega ({{\overline{\varphi }}}_\gamma - \widehat{\varphi }_Q) v, \end{aligned}$$
$$\begin{aligned}- & {} {\alpha } \int _\Omega \partial _tp\, v + \int _\Omega \nabla p \cdot \nabla v -\int _\Omega q\,v +\int _\Omega P({{\overline{\varphi }}}_\gamma )\,(p-r)\,v=0, \end{aligned}$$
$$\begin{aligned}- & {} \int _\Omega \partial _tr \,v +\int _\Omega \nabla r\cdot \nabla v - \chi \int _\Omega q\, v -\int _\Omega P({{\overline{\varphi }}}_\gamma )\,(p-r)\, v=0 , \end{aligned}$$

for every \(v\in V\) and almost every \(t \in (0,T)\), as well as the terminal conditions

$$\begin{aligned} (p+\beta q)(T)=b_2({{\overline{\varphi }}}_\gamma (T)-\widehat{\varphi }_{\Omega }), \quad \alpha p(T)= 0,\quad r(T)=0\quad \hbox {a.e. in }\,\Omega . \end{aligned}$$

Now define

$$\begin{aligned} \mathbf{d}_\gamma (x,t) := {\big (\!-\mathbbm {h}({{\overline{\varphi }}}_\gamma (x,t)) p_\gamma (x,t) , r_\gamma (x,t)\big )}, \quad \hbox {for}\,\,\hbox {a.e.} (x,t) \in Q. \end{aligned}$$

It is then a standard matter (for the details, see the proof of [16, Thm. 5.4]) to use the adjoint variables to simplify the variational inequality (5.3). This yields the following variational inequality:

$$\begin{aligned}&\iint _Q \bigl (\mathbf{d}_\gamma + b_0\, \overline{\mathbf{u}}_\gamma +\overline{\mathbf{u}}_\gamma -\overline{\mathbf{u}}\bigr )\cdot (\mathbf {v} -\overline{\mathbf{u}}_\gamma )\,+\,\kappa \iint _Q\overline{\varvec{\lambda }}_\gamma \cdot (\mathbf {v}-\overline{\mathbf{u}}_\gamma ) \ge 0 \quad \forall \, \mathbf {v} \in {{\mathcal {U}}}_{\mathrm{ad}}, \end{aligned}$$

where \(\overline{\varvec{\lambda }}_\gamma \in \partial g(\overline{\mathbf{u}}_\gamma )\subset L^2(Q)^2\). We now pick any sequence \(\{\gamma _n\}\subset (0,1]\) such that \(\gamma _n\searrow 0\). Then, by Theorem 4.5, we have that (cf. (3.2), (3.3), and (3.5)), as \(n\rightarrow \infty \),

$$\begin{aligned} \overline{\mathbf{u}}_{\gamma _n}\rightarrow \overline{\mathbf{u}}&\quad \hbox {strongly in }\,L^2(Q)^2, \end{aligned}$$
$$\begin{aligned} {\overline{\mu }}_{\gamma _n}\rightarrow {\overline{\mu }}&\quad \hbox {weakly-star in }\,X\, \hbox { and strongly in }\,C^0([0,T];L^s(\Omega ))\, \hbox { for }\,s\in [1,6), \end{aligned}$$
$$\begin{aligned} {\overline{\varphi }}_{\gamma _n}\rightarrow {\overline{\varphi }}&\quad \hbox {weakly-star in }\,{\widetilde{X}}\, \hbox { and strongly in }\,C^0({\overline{Q}}), \end{aligned}$$
$$\begin{aligned} {\overline{\sigma }}_{\gamma _n}\rightarrow {\overline{\sigma }}&\quad \hbox {weakly-star in }\,X\, \hbox { and strongly in }\,C^0([0,T];L^s(\Omega ))\, \hbox { for }\,s\in [1,6), \end{aligned}$$

where the strong convergence in (5.20) and (5.22) follows from [46, Sect. 8, Cor. 4] since \(\,V\,\) is compactly embedded in \(L^s(\Omega )\) for every \(s\in [1,6)\). Moreover, we also have, as \(n\rightarrow \infty ,\)

$$\begin{aligned} {\overline{\varphi }}_{\gamma _n}(T)\rightarrow {\overline{\varphi }}(T) \quad \hbox {strongly in }\,C^0({\overline{\Omega }}), \end{aligned}$$

and, by Lipschitz continuity, that

$$\begin{aligned}&F_2''({\overline{\varphi }}_{\gamma _n})\rightarrow F_2''({\overline{\varphi }}), \quad P({\overline{\varphi }}_{\gamma _n})\rightarrow P({\overline{\varphi }}), \quad P'({\overline{\varphi }}_{\gamma _n})\rightarrow P'({\overline{\varphi }}), \quad \mathbbm {h}'({\overline{\varphi }}_{\gamma _n})\rightarrow \mathbbm {h}'({\overline{\varphi }}),\nonumber \\&\hbox {all strongly in }\,C^0({\overline{Q}}). \end{aligned}$$

We now derive general bounds for the adjoint variables \((p_\gamma ,q_\gamma ,r_\gamma )\), where we consider the system (5.4)–(5.8) for \((p,q,r)=(p_\gamma ,q_\gamma ,r_\gamma )\). In this process, we denote by \(C_i\), \(i\in {\mathbb {N}}\), positive constants that may depend on the data of the system, but not on \(\gamma \in (0,1]\). Also, we make repeated use of the global (uniform with respect to \(\gamma \in (0,1]\)) estimate (2.18) without further reference. We also note that \(-1\le {{\overline{\varphi }}_\gamma } \le 1\) on \({\overline{Q}}\) for all \({\gamma \in (0,1]}\), so that

$$\begin{aligned}&{\Vert F_2''({\overline{\varphi }}_\gamma )\Vert _{{L^\infty (Q)}}\,+\,\max _{i=0,1}\Vert P^{(i)}({\overline{\varphi }}_\gamma )\Vert _{{L^\infty (Q)}} +\Vert \mathbbm {h}'({\overline{\varphi }}_\gamma )\Vert _{{L^\infty (Q)}} \le \,C_1\quad \forall \,\gamma \in (0,1].} \end{aligned}$$

First estimate: The following estimate is only formal. For a rigorous proof, it would have to be performed on the level of a suitable Faedo–Galerkin scheme for the approximate solution of (5.4)–(5.8). For the sake of brevity, we avoid writing such a scheme explicitly here and argue formally, knowing that this estimate can be made rigorous.

We multiply (5.4) by \(\,q_\gamma \), (5.5) by \(\,-\partial _tp_\gamma \), (5.6) by \(\,\chi ^2r_\gamma \), add the resulting identities, and integrate over \(Q^t:=\Omega \times (t,T)\), where \(t\in [0,T)\). Then, we add to both sides the same term \(\,{\tfrac{1}{2}}\Vert p_\gamma (t)\Vert ^2={\tfrac{1}{2}}\Vert p_\gamma (T)\Vert ^2- {{\iint }_{Q^t}p_\gamma \,\partial _tp_\gamma }\). Integration by parts then leads to the equality

$$\begin{aligned}&\frac{\beta }{2} \Vert q_\gamma (t)\Vert ^2 \,+\iint _{Q^t}|\nabla q_\gamma |^2\,+\, \alpha \iint _{Q^t}|\partial _tp_\gamma |^2 + \frac{1}{2} \mathopen \Vert p_\gamma (t)\mathclose \Vert _V^2 + \frac{\chi ^2}{2} \mathopen \Vert r_\gamma (t)\mathclose \Vert ^2\nonumber \\&\quad + \chi ^2 \iint _{Q^t}|\nabla r_\gamma |^2 +\iint _{Q^t}F_{1,\gamma }''({\overline{\varphi }}_\gamma )\,|q_\gamma |^2\nonumber \\&=\, \frac{\beta }{2} \mathopen \Vert q_\gamma (T)\mathclose \Vert ^2 + {{}\frac{1}{2} \mathopen \Vert p_\gamma (T)\mathclose \Vert _V^2 + \frac{\chi ^2}{2} \mathopen \Vert r_\gamma (T)\mathclose \Vert ^2 {}} \nonumber \\&\quad {} + b_1\iint _{Q^t} ({{\overline{\varphi }}}_\gamma -\widehat{\varphi }_{Q})\,q_\gamma +\chi \iint _{Q^t} \nabla r_\gamma \cdot \nabla q_\gamma - \iint _{Q^t} F_2''({{\overline{\varphi }}}_\gamma )\,|q_\gamma |^2 \nonumber \\&\quad - \iint _{Q^t} \mathbbm {h}'({{\overline{\varphi }}}_\gamma )\,{\overline{u}}_{\gamma ,1} \,p_\gamma \,q_\gamma {+ \iint _{Q^t} P'({{\overline{\varphi }}}_\gamma )({{\overline{\sigma }}}_\gamma +\chi (1-{{\overline{\varphi }}}_\gamma )-{{\overline{\mu }}}_\gamma )\,(p_\gamma -r_\gamma )\,q_\gamma }\nonumber \\&\quad - \chi \iint _{Q^t} P({{\overline{\varphi }}}_\gamma )\,(p_\gamma -r_\gamma )\,q_\gamma +\iint _{Q^t} P({{\overline{\varphi }}}_\gamma )\,(p_\gamma -r_\gamma )\,\partial _tp_\gamma - \iint _{Q^t} p_\gamma \, \partial _tp_\gamma \nonumber \\&\quad + \chi ^3 \iint _{Q^t} q_\gamma \,r_\gamma + \chi ^2 \iint _{Q^t} P({{\overline{\varphi }}}_\gamma )\,(p_\gamma -r_\gamma )\,r_\gamma =: \sum _{i=1}^{{13}}I_i. \end{aligned}$$

Observe that the last term on the left-hand side is nonnegative since \(F_{1,\gamma }''\ge 0\). We estimate the terms on the right-hand side individually. The first three terms are bounded by a constant, due to the terminal conditions (5.8), the assumption (C2), and the fact that \(\,\Vert {\overline{\varphi }}_\gamma \Vert _{{L^\infty (Q)}}\le 1\). Likewise, for the fourth term we get

$$\begin{aligned} |I_4| \le C_2 \iint _{Q^t} (|q_\gamma |^2 +1). \end{aligned}$$

Moreover, invoking (2.18), (5.24) and Young’s inequality, we easily see that

$$\begin{aligned}&{|I_5|+|I_6|+|I_7|+|I_8|+|I_9|+|I_{12}|+|I_{13}|} \nonumber \\&\quad \le {{}\frac{\chi ^2}{2} \iint _{Q^t} |\nabla r_\gamma |^2 + \frac{1}{2} \iint _{Q^t}|\nabla {q_\gamma }|^2{}} + {C_3}\iint _{Q^t}\bigl (|p_\gamma |^2+|q_\gamma |^2+|r_\gamma |^2\bigr ) . \end{aligned}$$

Finally, owing to (5.24) and Young’s inequality,

$$\begin{aligned} {|I_{10}|+|I_{11}|} \le \frac{\alpha }{2}\iint _{Q^t} |\partial _tp_\gamma |^2 +{C_4} \iint _{Q^t}\bigl (|p_\gamma |^2+|r_\gamma |^2\bigr ). \end{aligned}$$

Now, we combine the above estimates and invoke Gronwall’s lemma to infer that, for every \(\gamma \in (0,1],\)

$$\begin{aligned} \Vert p_{\gamma }\Vert _{H^1(0,T; H) \cap L^\infty (0,T; V )}&+ \Vert q_{\gamma }\Vert _{L^ \infty (0,T; H) \cap L^2 (0,T; V)} \nonumber \\+&\quad \Vert r_{\gamma }\Vert _{L^\infty (0,T ; H) \cap L^2(0,T ; V)} \le C_5. \end{aligned}$$
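
The combination step can be made explicit as follows. This is only a sketch: the symbols \(y_\gamma \), \(C\), \(c_0\), and \(K\) are shorthand introduced here and are not part of the numbered estimates, and we assume \(\chi >0\) so that \(c_0\) below is positive. Inserting the bounds for \(I_1,\ldots ,I_{13}\) into the identity above, absorbing the gradient terms and the term involving \(\partial _tp_\gamma \) into the left-hand side, and dropping the nonnegative term containing \(F_{1,\gamma }''\), one obtains, with \(y_\gamma (s):=\Vert p_\gamma (s)\Vert ^2+\Vert q_\gamma (s)\Vert ^2+\Vert r_\gamma (s)\Vert ^2\),

$$\begin{aligned}&\frac{\beta }{2} \Vert q_\gamma (t)\Vert ^2 + \frac{1}{2} \Vert p_\gamma (t)\Vert _V^2 + \frac{\chi ^2}{2} \Vert r_\gamma (t)\Vert ^2 + \frac{1}{2}\iint _{Q^t}|\nabla q_\gamma |^2 + \frac{\alpha }{2} \iint _{Q^t}|\partial _tp_\gamma |^2 + \frac{\chi ^2}{2} \iint _{Q^t}|\nabla r_\gamma |^2 \nonumber \\&\quad \le \,C\,\Big (1+\int _t^T y_\gamma (s)\,\mathrm{d}s\Big ) \quad \hbox {for all }\,t\in [0,T]. \end{aligned}$$

Since \(\Vert p_\gamma (t)\Vert _V\ge \Vert p_\gamma (t)\Vert \), the left-hand side dominates \(c_0\,y_\gamma (t)\) with \(c_0:=\tfrac{1}{2}\min \{\beta ,1,\chi ^2\}\). With \(K:=C/c_0\), the backward-in-time form of Gronwall's lemma then gives

$$\begin{aligned} y_\gamma (t)\,\le \,K+K\int _t^T y_\gamma (s)\,\mathrm{d}s \quad \Longrightarrow \quad y_\gamma (t)\,\le \,K\,e^{K(T-t)} \quad \hbox {for all }\,t\in [0,T], \end{aligned}$$

and inserting this bound into the right-hand side of the first inequality controls the remaining integral terms uniformly in \(\gamma \in (0,1]\).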

Second estimate: We can now rewrite Eq. (5.6) as a backward-in-time parabolic equation with null terminal condition and source term \(f_r{:} = \chi q_\gamma + P({{\overline{\varphi }}}_\gamma )(p_\gamma -r_\gamma )\), which is uniformly bounded in \(L^\infty (0,T; H)\) due to the above estimate. It is then a standard matter to infer that

$$\begin{aligned} \mathopen \Vert r_\gamma \mathclose \Vert _{H^{1}(0,T;H) \cap L^\infty (0,T; V) \cap L^2 (0,T ; W_0)} \le {C_6} \quad \forall \,\gamma \in (0,1]. \end{aligned}$$

In addition, since \({r_\gamma } (T)=0\in L^\infty (\Omega )\), we can apply the regularity result [41, Thm. 7.1, p. 181] to infer that also

$$\begin{aligned} \mathopen \Vert r_\gamma \mathclose \Vert _{L^\infty (Q)} \le {C_7} \quad \forall \,\gamma \in (0,1]. \end{aligned}$$

Third estimate: From Eq. (5.5) (see also (5.8)) and the parabolic regularity theory, we similarly recover that

$$\begin{aligned} \mathopen \Vert p_\gamma \mathclose \Vert _{L^2 (0,T; W_0){{}\cap L^\infty (Q)}} \le {C_8}\quad \forall \,\gamma \in (0,1]. \end{aligned}$$

Fourth estimate: For the next estimate, we introduce the space

$$\begin{aligned} {{\mathcal {Q}}}=\{v\in H^1(0,T;H)\cap L^2(0,T;V):v(0)=0\}, \end{aligned}$$

which is a closed subspace of \(H^1(0,T;H)\cap L^2(0,T;V)\) and thus a Hilbert space. Obviously, \({{\mathcal {Q}}}\) is continuously embedded in \(C^0([0,T];H)\), and we have the dense and continuous embeddings \({{\mathcal {Q}}}\subset L^2(0,T;H)\subset {{\mathcal {Q}}}^*\), where it is understood that

$$\begin{aligned} \langle v,w\rangle _{{\mathcal {Q}}}\,=\,\int _0^T(v(t),w(t))\,\mathrm{d}t \quad \hbox {for all} \,w\in {{\mathcal {Q}}}\,\hbox { and} \,v\in L^2(0,T;H). \end{aligned}$$

Next, we recall an integration-by-parts formula, which is well known for more regular functions and was proved in the following form in [11, Lem. 4.5]: if \(({{\mathcal {V}}},{{\mathcal {H}}},{\mathcal V^*})\) is a Hilbert triple and

$$\begin{aligned} w\in H^1(0,T;{{\mathcal {H}}})\cap L^2(0,T;{{\mathcal {V}}}) \quad \hbox {and}\quad z\in H^1(0,T;{{\mathcal {V}}}^*)\cap L^2(0,T;{{\mathcal {H}}}), \end{aligned}$$

then the function \(\,t\mapsto (w(t),z(t))_{{\mathcal {H}}}\,\) is absolutely continuous, and for every \(t_1,t_2\in [0,T]\) the following formula holds:

$$\begin{aligned} \int _{t_1}^{t_2}\bigl [(\partial _tw(t),z(t))_{{\mathcal {H}}}+\langle \partial _tz(t),w(t)\rangle _{{\mathcal {V}}}\bigr ]\,\mathrm{d}t\,=\, (w(t_2),z(t_2))_{{\mathcal {H}}}-(w(t_1),z(t_1))_{{\mathcal {H}}}, \end{aligned}$$

where \((\,\cdot \,,\,\cdot \,)_{{\mathcal {H}}}\) and \(\langle \,\cdot \,,\,\cdot \,\rangle _{{\mathcal {V}}}\) denote the inner product in \({{\mathcal {H}}}\) and the dual pairing between \({{\mathcal {V}}}^*\) and \({{\mathcal {V}}}\), respectively.

We apply the above result to the special case when \({{\mathcal {H}}}=H\), \({{\mathcal {V}}}=V\), \(z=p_\gamma +\beta q_\gamma \), and \(w=v\in {{\mathcal {Q}}}\). Then, using the terminal condition (5.8), the fact that \(v(0)=0\) by (5.29), as well as the estimates (5.25) and (2.18), we have that

$$\begin{aligned}&\Big |\int _0^T\langle \partial _t(p_\gamma +\beta q_\gamma )(t),v(t)\rangle \,\mathrm{d}t \, \Big | \,\,\nonumber \\&\quad \le \Big |\iint _Q(p_\gamma {+}\beta q_\gamma )\,\partial _tv\Big |{+}\bigl |((p_\gamma +\beta q_\gamma )(T),v(T))\bigr | \nonumber \\&\quad \le \,\Vert p_\gamma +\beta q_\gamma \Vert _{L^2(Q)}\,\Vert \partial _tv\Vert _{L^2(Q)}\,+\,b_2\, {\Vert {\overline{\varphi }}_\gamma (T)-{\widehat{\varphi }_{\Omega }}\Vert \,\Vert v(T)\Vert } \nonumber \\&\quad \le \,{C_9}\,\Vert v\Vert _{H^1(0,T;H)}\,+\,{C_{10}}\,\Vert v\Vert _{C^0([0,T];H)}\,\le \,{C_{11}}\,\Vert v\Vert _{\mathcal {Q}}, \end{aligned}$$

which means that

$$\begin{aligned} \Vert \partial _t(p_\gamma +\beta q_\gamma )\Vert _{{{\mathcal {Q}}}^*}\,\le \,{C_{12}} \quad \forall \,\gamma \in (0,1]. \end{aligned}$$
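
For the reader's convenience, we spell out the identity behind the first inequality in the chain above. It is a direct specialization of (5.31) with \(t_1=0\), \(t_2=T\), \(w=v\in {{\mathcal {Q}}}\), and \(z=p_\gamma +\beta q_\gamma \), and nothing beyond (5.31), (5.8), and \(v(0)=0\) is used:

$$\begin{aligned} \int _0^T\langle \partial _t(p_\gamma +\beta q_\gamma )(t),v(t)\rangle \,\mathrm{d}t&=\,\bigl ((p_\gamma +\beta q_\gamma )(T),v(T)\bigr )-\bigl ((p_\gamma +\beta q_\gamma )(0),v(0)\bigr )-\iint _Q(p_\gamma +\beta q_\gamma )\,\partial _tv \nonumber \\&=\,b_2\int _\Omega ({\overline{\varphi }}_\gamma (T)-{\widehat{\varphi }}_\Omega )\,v(T)\,-\iint _Q(p_\gamma +\beta q_\gamma )\,\partial _tv . \end{aligned}$$

The triangle inequality and the Cauchy–Schwarz inequality applied to the two terms on the right-hand side then give the second line of the estimate.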

At this point, we can conclude from the estimates (5.24), (5.25)–(5.28), (5.33), using comparison in (5.13), that the linear mapping

$$\begin{aligned} \Lambda _\gamma :L^2(Q)\ni v\mapsto \Lambda _\gamma (v)={\iint _Q F_{1,\gamma }''({\overline{\varphi }}_\gamma )\,q_\gamma \,v \in {\mathbb {R}}} \end{aligned}$$

satisfies

$$\begin{aligned} \Vert \Lambda _\gamma \Vert _{{\mathcal Q}^*}\,\le \,{C_{13}}\quad \forall \,\gamma \in (0,1]. \end{aligned}$$

We now return to the sequence \(\gamma _n\searrow 0\) introduced above and recall the convergence properties (5.19)–(5.23). Owing to the global estimates (5.25)–(5.28), (5.35) and possibly taking another subsequence, we may without loss of generality assume that there are limit points \(p\), \(q\), \(r\), and \(\Lambda \) such that, as \(n\rightarrow \infty \),

$$\begin{aligned} p_{\gamma _n}\rightarrow p&\quad \hbox {weakly-star in }\,X\, \hbox { and strongly in }\,C^0([0,T];L^s(\Omega ))\, \hbox { for }\, s\in [1,6), \end{aligned}$$
$$\begin{aligned} q_{\gamma _n}\rightarrow q&\quad \hbox {weakly-star in }\, L^\infty (0,T;H)\cap L^2(0,T;V), \end{aligned}$$
$$\begin{aligned} r_{\gamma _n}\rightarrow r&\quad \hbox {weakly-star in }\,X\, \hbox { and strongly in }\,C^0([0,T];L^s(\Omega ))\, \hbox { for }\, s\in [1,6), \end{aligned}$$
$$\begin{aligned} \Lambda _{\gamma _n}\rightarrow \Lambda&\quad \text{ weakly } \text{ in } \,{\mathcal Q}^*, \end{aligned}$$

where X is defined in (2.21) and the strong convergence in (5.36) and (5.38) again follows from [46, Sect. 8, Cor. 4].

We now perform a passage to the limit as \(n\rightarrow \infty \) in the adjoint system (5.13)–(5.16), written for \(\gamma =\gamma _n\) and \((p,q,r)=(p_{\gamma _n},q_{\gamma _n},r_{\gamma _n})\), for \(n\in {\mathbb {N}}\). From the convergence results stated above, it is obvious that, as \(n\rightarrow \infty \),

$$\begin{aligned} F_2''({\overline{\varphi }}_{\gamma _n})\,q_{\gamma _n}&\rightarrow F_2''({\overline{\varphi }})\,q&\hbox {weakly in }\,L^2(Q), \end{aligned}$$
$$\begin{aligned} P({\overline{\varphi }}_{\gamma _n})\,(p_{\gamma _n}-r_{\gamma _n})&\rightarrow P({\overline{\varphi }})\,(p-r)&\hbox {weakly in }\,L^2(Q), \end{aligned}$$
$$\begin{aligned} b_1({\overline{\varphi }}_{\gamma _n}-\widehat{\varphi }_{Q})&\rightarrow b_1({\overline{\varphi }}-\widehat{\varphi }_{Q})&\hbox {strongly in }\,L^2(Q), \end{aligned}$$
$$\begin{aligned} b_2({\overline{\varphi }}_{\gamma _n}(T)-\widehat{\varphi }_{\Omega })&\rightarrow b_2({\overline{\varphi }}(T)-\widehat{\varphi }_{\Omega })&\hbox {{strongly} in }\,L^{2}(\Omega ). \end{aligned}$$

A bit less obvious is the fact that also

$$\begin{aligned} \mathbbm {h}'({\overline{\varphi }}_{\gamma _n})\,{\overline{u}}_{\gamma _n,1}\,p_{\gamma _n}\rightarrow \mathbbm {h}'({\overline{\varphi }})\,{\overline{u}}_1\,p\quad \hbox {weakly in }\,L^2(Q), \end{aligned}$$

and

$$\begin{aligned}&P'({\overline{\varphi }}_{\gamma _n})\,({\overline{\sigma }}_{\gamma _n}+\chi (1-{\overline{\varphi }}_{\gamma _n})-{\overline{\mu }}_{\gamma _n})(p_{\gamma _n}-r_{\gamma _n})\rightarrow \nonumber \\&P'({\overline{\varphi }})\,({\overline{\sigma }}+\chi (1-{\overline{\varphi }})-{\overline{\mu }})(p-r)\nonumber \\&\hbox {weakly in }\,L^2(Q). \end{aligned}$$


We only show the validity of (5.44), since the proof of (5.45) is similar and even simpler. To this end, recall the strong convergence properties (5.19) and (5.23), as well as the fact that, in particular, \(p_{\gamma _n}\rightarrow p\) strongly in \(C^0([0,T];H)\). It is then easily verified that, for every \(z\in {L^\infty (Q)}\), we have

$$\begin{aligned} \lim _{n\rightarrow \infty } \iint _Q \mathbbm {h}'({\overline{\varphi }}_{\gamma _n})\,{\overline{u}}_{\gamma _n,1}\,p_{\gamma _n}\,z\,=\iint _Q \mathbbm {h}'({\overline{\varphi }})\,{\overline{u}}_1 p \,z , \end{aligned}$$

that is, we have weak convergence in \(L^1(Q)\). On the other hand, \(\,\{\mathbbm {h}'({\overline{\varphi }}_{\gamma _n})\,{\overline{u}}_{\gamma _n,1}\,p_{\gamma _n}\}\,\) is bounded in \(L^2(Q)\), whence (5.44) follows.

Remark 5.1

We invite the reader to revisit Remark 3.3. In the weaker setting discussed there, (5.20)–(5.22) should be replaced by

$$\begin{aligned} {\overline{\mu }}_{\gamma _n}\rightarrow {\overline{\mu }}&\quad \hbox {weakly-star in }\,X_w\, \hbox { and strongly in }\,L^2(0,T;L^s(\Omega ))\, \hbox { for }\,s\in [1,6), \end{aligned}$$
$$\begin{aligned} {\overline{\varphi }}_{\gamma _n}\rightarrow {\overline{\varphi }}&\quad \hbox {weakly-star in }\,{\widetilde{X}}_w\, \hbox { and strongly in }\,C^{0}([0,T];L^s(\Omega ))\, \hbox { for }\,s\in [1,6), \end{aligned}$$
$$\begin{aligned} {\overline{\sigma }}_{\gamma _n}\rightarrow {\overline{\sigma }}&\quad \hbox {weakly-star in }\,X_w\, \hbox { and strongly in }\,L^2(0,T;L^s(\Omega ))\, \hbox { for }\,s\in [1,6). \end{aligned}$$

By Lipschitz continuity, it then turns out that

$$\begin{aligned}&F_2''({\overline{\varphi }}_{\gamma _n})\rightarrow F_2''({\overline{\varphi }}), \quad P({\overline{\varphi }}_{\gamma _n})\rightarrow P({\overline{\varphi }}), \quad P'({\overline{\varphi }}_{\gamma _n})\rightarrow P'({\overline{\varphi }}), \quad \mathbbm {h}'({\overline{\varphi }}_{\gamma _n})\rightarrow \mathbbm {h}'({\overline{\varphi }}),\nonumber \\&\hbox {all strongly in }\,C^{0}([0,T];L^s(\Omega ))\, \hbox { for }\,s\in [1,6). \end{aligned}$$

Then, one may directly check that (5.40)–(5.44) still hold. As for (5.45), we have, for instance, that \( P'({\overline{\varphi }}_{\gamma _n})\) converges strongly in \(\,C^{0}([0,T];L^4(\Omega ))\), \(({\overline{\sigma }}_{\gamma _n}+\chi (1-{\overline{\varphi }}_{\gamma _n}){-{\overline{\mu }}_{\gamma _n}})\) converges strongly in \(\,L^2(0,T;L^4(\Omega ))\), and \((p_{\gamma _n}-r_{\gamma _n})\) converges weakly-star in \(\,L^\infty (0,T;L^4(\Omega ))\); consequently,

$$\begin{aligned}&P'({\overline{\varphi }}_{\gamma _n})\,({\overline{\sigma }}_{\gamma _n}+\chi (1-{\overline{\varphi }}_{\gamma _n})-{\overline{\mu }}_{\gamma _n})(p_{\gamma _n}-r_{\gamma _n})\rightarrow \nonumber \\&P'({\overline{\varphi }})\,({\overline{\sigma }}+\chi (1-{\overline{\varphi }})-{\overline{\mu }})(p-r)\nonumber \\&\hbox {weakly in }\,L^2(0,T;L^{4/3}(\Omega )), \end{aligned}$$


with \(L^2(0,T;L^{4/3}(\Omega ))\) continuously embedded in \(L^2 (0,T ; V^*)\). Then, the limit procedure in the sequel can be carried out also in the weaker setting.
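
The exponents appearing in the last convergence statement can be checked by Hölder's inequality. As a sketch, let \(a\), \(b\), \(c\) be generic placeholders (not notation used elsewhere in this paper) for the three factors. In space we have \(\tfrac{1}{4}+\tfrac{1}{4}+\tfrac{1}{4}=\tfrac{3}{4}\), while in time the product of two essentially bounded factors and one square-integrable factor is square integrable, so that

$$\begin{aligned} \Vert a\,b\,c\Vert _{L^2(0,T;L^{4/3}(\Omega ))}\,\le \,\Vert a\Vert _{L^\infty (0,T;L^4(\Omega ))}\,\Vert b\Vert _{L^2(0,T;L^4(\Omega ))}\,\Vert c\Vert _{L^\infty (0,T;L^4(\Omega ))}. \end{aligned}$$

The embedding into \(L^2(0,T;V^*)\) rests on the continuous embedding \(V\subset L^4(\Omega )\): denoting by \(C_\Omega \) the corresponding embedding constant (a name chosen here for illustration), we have for \(f\in L^{4/3}(\Omega )\) and \(v\in V\) that

$$\begin{aligned} \Big |\int _\Omega f\,v\Big |\,\le \,\Vert f\Vert _{L^{4/3}(\Omega )}\,\Vert v\Vert _{L^4(\Omega )}\,\le \,C_\Omega \,\Vert f\Vert _{L^{4/3}(\Omega )}\,\Vert v\Vert _V . \end{aligned}$$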

Now, we apply the integration-by-parts formula (5.31) to see that for every \(v\in {{\mathcal {Q}}}\) it holds that

$$\begin{aligned}&\lim _{n\rightarrow \infty }\int _0^T\langle -\partial _t(p_{\gamma _n}+\beta q_{\gamma _n})(t),v(t)\rangle \,\mathrm{d}t\nonumber \\&\quad =\,\lim _{n\rightarrow \infty }\Big (\iint _Q(p_{\gamma _n}+\beta q_{\gamma _n})\,\partial _tv- b_2\int _\Omega ({\overline{\varphi }}_{\gamma _n}(T)-{\widehat{\varphi }}_\Omega )\,v(T)\Big )\nonumber \\&\quad =\,\iint _Q(p+\beta q)\,\partial _tv\,-\,b_2\int _\Omega ({\overline{\varphi }}(T)-\widehat{\varphi }_{\Omega })\,v(T) . \end{aligned}$$

At this point, we may pass to the limit as \(n\rightarrow \infty \) to arrive at the following limit system:

$$\begin{aligned} \langle \Lambda ,v\rangle _{{\mathcal Q}}&=\,-\iint _Q(p+\beta q)\,\partial _tv\,+\,b_2\int _\Omega ({\overline{\varphi }}(T)-{\widehat{\varphi }_{\Omega }})\,v(T) - \iint _Q \nabla q \cdot \nabla v \nonumber \\&\quad + \chi \iint _Q \nabla r \cdot \nabla v-\iint _Q F_2''({\overline{\varphi }})\, q\, v\,- \iint _Q \mathbbm {h}'({{\overline{\varphi }}})\,{\overline{u}}_{1}\, p\, v \nonumber \\&\quad +\iint _Q P'({{\overline{\varphi }}})({{\overline{\sigma }}}+\chi (1-{{\overline{\varphi }}})-{{\overline{\mu }}}) (p-r) v - \chi \iint _Q P({{\overline{\varphi }}})(p-r)v \nonumber \\&\quad + b_1 \iint _Q ({{\overline{\varphi }}}- {\widehat{\varphi }_{Q}}) v \qquad \hbox {for all} \,v\in {{\mathcal {Q}}}, \end{aligned}$$
$$\begin{aligned}&\quad -{\alpha } \int _\Omega \partial _tp\,v + \int _\Omega \nabla p \cdot \nabla v -\int _\Omega q\,v +\int _\Omega P({{\overline{\varphi }}})\,(p-r)\,v=0 \nonumber \\&\quad \qquad \hbox {for all} \,v\in V, \hbox {almost everywhere in}\,\, (0,T), \end{aligned}$$
$$\begin{aligned}&\quad - \int _\Omega \partial _tr\,v +\int _\Omega \nabla r\cdot \nabla v - \chi \int _\Omega q\, v -\int _\Omega P({{\overline{\varphi }}})\,(p-r)\, v=0 \nonumber \\&\quad \qquad \hbox {for all} \,v\in V, \hbox {almost everywhere in}\,\, (0,T), \end{aligned}$$
$$\begin{aligned}&\alpha p(T)= 0,\quad r(T)=0\quad \hbox {a.e. in }\,\Omega . \end{aligned}$$

Finally, we consider the variational inequality (5.18) for \(\gamma =\gamma _n\), \(n\in {\mathbb {N}}\). First observe that the above convergence results certainly imply that

$$\begin{aligned} \mathbf {d}_{\gamma _n}+b_0\overline{\mathbf{u}}_{\gamma _n}+\overline{\mathbf{u}}_{\gamma _n}-\overline{\mathbf{u}}\,\rightarrow \,\mathbf {d}+b_0\overline{\mathbf{u}}\quad \hbox {strongly in }\,L^2(Q)^2, \end{aligned}$$

where \(\mathbf {d} {{}: ={}} (-\mathbbm {h}({\overline{\varphi }})p,r)\) a.e. in Q.

At this point, we recall that the subdifferential \(\,\partial g\,\) is defined on the entire space \(L^2(Q)^2\) and maximal monotone, and thus a locally bounded operator. Owing to (5.19), the sequence \(\{{\overline{\varvec{\lambda }}}_{\gamma _n}\}\) introduced in (5.3) is therefore bounded in \(L^2(Q)^2\), and we may without loss of generality assume that there is some \({\overline{\varvec{\lambda }}}\in L^2(Q)^2\) such that \(\,{\overline{\varvec{\lambda }}}_{\gamma _n}\rightarrow {\overline{\varvec{\lambda }}}\,\) weakly in \(L^2(Q)^2\) as \(n\rightarrow \infty \). A standard argument for maximal monotone operators then yields that \({\overline{\varvec{\lambda }}}\in \partial g(\overline{\mathbf{u}})\), and passage to the limit as \(n\rightarrow \infty \) in (5.18) then shows that the following variational inequality is satisfied for the limiting variables:

$$\begin{aligned}&\iint _Q \bigl (\mathbf{d}\,+\, b_0\, \overline{\mathbf{u}}\,+\,\kappa \,\overline{\varvec{\lambda }}\bigr )\cdot (\mathbf {v}-\overline{\mathbf{u}}) \ge 0 \quad \forall \, \mathbf {v} \in {{\mathcal {U}}}_{\mathrm{ad}}. \end{aligned}$$

Summarizing the above considerations, we have proved the following first-order necessary optimality conditions for the optimal control problem (\({\mathcal {CP}}_0\)).

Theorem 5.2

Suppose that the conditions (A1)–(A4), (C1)–(C3), and (2.12)–(2.14) are fulfilled, and let \(\overline{\mathbf{u}}\in {{\mathcal {U}}}_{\mathrm{ad}}\) be a minimizer of the optimal control problem (\({\mathcal {CP}}_0\)) with associated state \(({\overline{\mu }},{\overline{\varphi }},{\overline{\xi }}, {\overline{\sigma }})={{\mathcal {S}}}_0(\overline{\mathbf{u}})\). Then there exist \(p,q,r,{\overline{\varvec{\lambda }}},\) and \(\Lambda \) such that the following holds true:

(i)      \(p,r\in X\), \(q\in L^\infty (0,T;H)\cap L^2(0,T;V)\), \({\overline{\varvec{\lambda }}}\in {\partial }g(\overline{\mathbf{u}})\), \(\Lambda \in {{\mathcal {Q}}}^*\).

(ii)    The adjoint system (5.49)–(5.52) and the variational inequality (5.54) are satisfied where \(\,\mathbf{d}=(-\mathbbm {h}({\overline{\varphi }})p,r)\).

Remark 5.3

(i) Observe that the adjoint state \((p,q,r)\) and the Lagrange multiplier \(\Lambda \) are not unique. However, all possible choices satisfy (5.54).

(ii) We have, for every \(n\in {\mathbb {N}}\), the complementary slackness condition (cf. (5.34))

$$\begin{aligned} {\Lambda _{\gamma _n}(q_{\gamma _n})}=\iint _Q F_{1,\gamma _n}''({\overline{\varphi }}_{\gamma _n})\,|q_{\gamma _n}|^2\,\ge \,0. \end{aligned}$$

Unfortunately, the weak convergence properties of \(\{q_{\gamma _n}\}\) do not permit a passage to the limit in this inequality to derive a corresponding result for (\({\mathcal {CP}}_0\)).

6 Sparsity of Optimal Controls

In this section, we discuss the sparsity of optimal controls, that is, the possibility that the optimal controls will vanish in some proper subset of \(\,Q\); the form of this subset depends on the particular choice of the convex function g in the cost functional, while its size depends on the sparsity parameter \(\kappa \) (see (1.1)). We again generally assume that the conditions (A1)–(A4), (C1)–(C3), and (2.12)–(2.14) are satisfied. Moreover, we assume that \(\overline{\mathbf{u}}=(\overline{u}_1,\overline{u}_2)\in {{\mathcal {U}}}_{\mathrm{ad}}\) is a minimizer of (\({\mathcal {CP}}_0\)) with associated state \(({\overline{\mu }},{\overline{\varphi }},{\overline{\xi }},{\overline{\sigma }}){= {{\mathcal {S}}}_0({\overline{\mathbf{u}}})}\) and adjoint state \((p,q,r)\). Then, the first-order necessary optimality condition (5.54) is satisfied. Since we plan to discuss sparsity properties, we make a further assumption:


(C4)    The sparsity parameter \(\kappa \) is positive.

The sparsity properties will be deduced from the variational inequality (5.54) and the particular form of the subdifferential \(\partial g\). In the following argumentation, we closely follow the lines of [47, Sect. 4]; since a detailed discussion is given there, we can afford to be brief here.

We begin our analysis by introducing the convex functionals g we are interested in.

Directional sparsity with respect to time: Here we use \(\,g_T: L^1(0,T;L^2(\Omega )) \rightarrow {\mathbb {R}}\),

$$\begin{aligned} g_T(u) = \Vert u\Vert _{L^1(0,T;L^{2}(\Omega ))}= \int _0^T \Vert u(\cdot ,t)\Vert _{L^{2}(\Omega )}\,\mathrm{d}t, \end{aligned}$$

with the subdifferential (in \(L^2(Q)\), cf. (C3))

$$\begin{aligned} \partial g_T(u) = \left\{ \lambda \in {L^2(Q)} : \left\{ \begin{array}{ll} \Vert \lambda (\cdot ,t)\Vert _{L^{2}(\Omega )} \le 1 &{}\,\hbox { if } \, {\Vert u(\cdot ,t)\Vert _{L^{2}(\Omega )}} = 0\\ \lambda (\cdot ,t)\,= \displaystyle \frac{u(\cdot ,t)}{ \Vert u(\cdot ,t)\Vert _{L^{2}(\Omega )}}&{}\,\hbox { if } \, {\Vert u(\cdot ,t)\Vert _{L^{2}(\Omega )}} \not = 0 \end{array} \right. \right\} , \end{aligned}$$

where the properties above are satisfied for almost every \(t \in (0,T)\).

Directional sparsity with respect to space: In this case we use \(g_\Omega : L^1(\Omega ;L^2(0,T)) \rightarrow {\mathbb {R}}\),

$$\begin{aligned} g_\Omega (u) = \Vert u\Vert _{{L^1(\Omega ;L^2(0,T))}}= \int _\Omega \Vert u(x,\cdot )\Vert _{L^2(0,T)}\,\mathrm{d}x, \end{aligned}$$

with the subdifferential

$$\begin{aligned} \partial g_\Omega (u) = \left\{ \lambda \in {L^2(Q)} : \left\{ \begin{array}{ll} \Vert \lambda (x,\cdot )\Vert _{L^2(0,T)} \le 1 &{}\,\hbox { if } \, {\Vert u(x,\cdot )\Vert _{L^2(0,T)}} = 0\\ \lambda (x,\cdot )= \displaystyle \frac{u(x,\cdot )}{\Vert u(x,\cdot )\Vert _{L^2(0,T)}}&{}\,\hbox { if } \, {\Vert u(x,\cdot )\Vert _{L^2(0,T)}} \not = 0 \end{array} \right. \right\} , \end{aligned}$$

where the above properties have to be fulfilled for almost every \(x \in \Omega \).

Spatio-temporal sparsity: Here we use \(g_Q: L^1(Q) \rightarrow {\mathbb {R}}\),

$$\begin{aligned} g_Q(u) = \Vert u\Vert _{L^1(Q)}=\iint _Q |u(x,t)|\, \mathrm{d}x\,\mathrm{d}t, \end{aligned}$$

with the subdifferential

$$\begin{aligned} \partial g_Q(u) = \left\{ \lambda \in {L^2(Q)}:\, \lambda (x,t) \left\{ \begin{array}{ll} =1 &{} \hbox { if } \, u(x,t) > 0\\ \in [-1,1]&{} \hbox { if } \, u(x,t) = 0\\ = -1 &{} \hbox { if } \, u(x,t) < 0\\ \end{array} \right. \hbox {, \ a.e. } {(x,t) \in Q} \right\} . \end{aligned}$$

Remark 6.1

Observe that in any of the cases \(g\in \{g_T,g_\Omega ,g_Q\}\) the subdifferential operates on the entire space \(L^2(Q)\). Moreover, in the third example it turns out that whenever \(\lambda \in \partial g_Q (u)\), then \(|\lambda |\le 1\) almost everywhere in Q.
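
To make the difference between the three functionals concrete, the following minimal Python sketch evaluates discrete analogues of \(g_T\), \(g_\Omega \), and \(g_Q\) for a grid function. It assumes a uniform space-time grid with cell sizes `dx` and `dt`; the function names, the array layout `u[k][i]` (time step `k`, spatial cell `i`), and the sample data are illustrative choices and not part of the analysis above.

```python
import math

def g_T(u, dt, dx):
    """Discrete L^1(0,T; L^2(Omega)) norm: L^2 in space first, then L^1 in time."""
    return sum(dt * math.sqrt(dx * sum(v * v for v in row)) for row in u)

def g_Omega(u, dt, dx):
    """Discrete L^1(Omega; L^2(0,T)) norm: L^2 in time first, then L^1 in space."""
    columns = zip(*u)  # one tuple of time values per spatial cell
    return sum(dx * math.sqrt(dt * sum(v * v for v in col)) for col in columns)

def g_Q(u, dt, dx):
    """Discrete L^1(Q) norm: sum of absolute values weighted by the cell volume."""
    return dt * dx * sum(abs(v) for row in u for v in row)

# Hypothetical sample: the control is active on the first time slice only.
u = [[1.0, 1.0], [0.0, 0.0]]
dt, dx = 0.5, 0.5
values = (g_T(u, dt, dx), g_Omega(u, dt, dx), g_Q(u, dt, dx))
```

In this sample the control vanishes on the whole second time slice, which is the pattern promoted by \(g_T\); the only difference between `g_T` and `g_Omega` is the order in which the square root and the summation are applied.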

In the following, we concentrate on directional sparsity in time, since this seems to be the most important for medical applications; indeed, if an application to medication is considered, directional sparsity in time will make it possible to stop the administration of drugs in certain intervals of time. The subsequent analysis is based on the following auxiliary sparsity result (see [1, 38, 47]) for the case of scalar controls:

Lemma 6.2

Assume that

$$\begin{aligned} C = \{v \in L^\infty (Q): {\underline{w}} \le v(x,t) \le {{\hat{w}}} \,\,\hbox { for a.e. }(x,t)\,\hbox { in }\,Q\}, \end{aligned}$$

with real numbers \({\underline{w}}< 0 < {\hat{w}}\), and let a function \(d \in L^2(Q)\) be given. Moreover, assume that \(\,u \in C\,\) is a solution to the variational inequality

$$\begin{aligned} \iint _Q (d + \kappa \lambda + b_0 u)(v - u) \ge 0 \quad \forall \, v \in C, \end{aligned}$$

with some \(\lambda \in \partial g_T(u)\). Then, for almost every \(t\in (0,T)\),

$$\begin{aligned} {\Vert u(\cdot ,t)\Vert _{L^2(\Omega )}{}}=0 \quad \Longleftrightarrow \quad \Vert d(\cdot ,t)\Vert _{L^2(\Omega )} \le \kappa , \end{aligned}$$

as well as

$$\begin{aligned} \lambda (\cdot ,t) \left\{ \begin{array}{lcl} \in B(0,1)&{}\,\hbox { if } \, \Vert u(\cdot ,t) \Vert _{L^2(\Omega )} = 0\\ = \displaystyle \frac{u(\cdot ,t)}{\Vert u(\cdot ,t) \Vert _{L^2(\Omega )}}&{}\,\hbox { if } \, \Vert u(\cdot ,t) \Vert _{L^2(\Omega )} \not = 0 \end{array} \right. , \end{aligned}$$

where \(B(0,1) = \{ v \in L^2(\Omega ): \Vert v\Vert _{L^2(\Omega )}\le 1\}.\)
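
The threshold behaviour in Lemma 6.2 can be illustrated numerically. The following minimal Python sketch treats a discretized version under the simplifying assumption that the box constraints are inactive, so that the variational inequality reduces to the equation \(d+\kappa \lambda +b_0u=0\) on every time slice; this reduction, the uniform spatial grid, and all names and sample data are assumptions made for illustration only. In this reduced setting the slice \(u(\cdot ,t)\) is a scaled copy of \(-d(\cdot ,t)\) if \(\Vert d(\cdot ,t)\Vert _{L^2(\Omega )}>\kappa \) and vanishes otherwise.

```python
import math

def l2_norm(values, cell_volume):
    """Discrete L^2(Omega) norm of one time slice on a uniform spatial grid."""
    return math.sqrt(cell_volume * sum(v * v for v in values))

def sparse_control(d, kappa, b0, cell_volume):
    """Solve d + kappa*lam + b0*u = 0 slice by slice, with lam a subgradient of g_T at u.

    Assumption: the box constraints are inactive, so the variational
    inequality of the lemma reduces to an equation.
    """
    u, lam = [], []
    for d_t in d:
        norm_d = l2_norm(d_t, cell_volume)
        if norm_d <= kappa:
            # Zero slice; lam = -d/kappa then has norm at most 1.
            u.append([0.0] * len(d_t))
            lam.append([-v / kappa for v in d_t])
        else:
            # Nonzero slice; u points in the direction of -d.
            scale = (1.0 - kappa / norm_d) / b0
            u_t = [-scale * v for v in d_t]
            norm_u = l2_norm(u_t, cell_volume)
            u.append(u_t)
            lam.append([v / norm_u for v in u_t])
    return u, lam

# Hypothetical data: three time slices, four spatial cells of volume 0.25.
d = [[0.2, -0.1, 0.3, 0.0], [2.0, -1.0, 0.5, 1.5], [0.0, 0.0, 0.0, 0.0]]
kappa, b0, cell_volume = 0.5, 1.0, 0.25
u, lam = sparse_control(d, kappa, b0, cell_volume)
```

For the sample data, the first and third slices of `d` have norm below `kappa`, so the corresponding control slices vanish, while the second slice stays active. With active box constraints an additional projection onto the admissible set is needed, which this sketch does not cover.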

We now apply Lemma 6.2 to the optimal control problem (\({\mathcal {CP}}_0\)) for which the variational inequality (5.54) holds true. To this end, we use the convex continuous functional

$$\begin{aligned} g(\mathbf{u}) = g(u_1,u_2) : = g_T(u_1) + g_T(u_2) = g_T(I_1\mathbf{u}) + g_T(I_2\mathbf{u}), \end{aligned}$$

where \(I_i\) denotes the linear and continuous projection mapping \(I_i: \mathbf{u}=(u_1,u_2) \mapsto u_i\), \(i = 1,2\), from \(L^1(0,T;L^2(\Omega ))^2\) to \(L^1(0,T;L^2(\Omega ))\). Since the convex functional \(g_T\) is continuous on the whole space \(L^1(0,T;L^2(\Omega ))\), we obtain from the well-known sum and chain rules for subdifferentials that

$$\begin{aligned} \partial g(\mathbf{u}) = I_1^*\, \partial g_T(I_1\mathbf{u}) + I_2^*\, \partial g_T(I_2\mathbf{u}) = (I,0)^\top \partial g_T(u_1) + (0,I)^\top \partial g_T(u_2), \end{aligned}$$

with the identity mapping \(I \in \mathcal {L}(L^1(0,T;L^2(\Omega )))\). Therefore, we have

$$\begin{aligned} \partial g(\mathbf{u}) = \{ (\lambda _1,\lambda _2) \in {L^2(Q)^2}: \lambda _i \in \partial g_T(u_i), \, i = 1,2\}. \end{aligned}$$

Now observe that the variational inequality (5.54) is equivalent to two independent variational inequalities for \(\overline{u}_1\) and \(\overline{u}_2\) that have to hold simultaneously, namely,

$$\begin{aligned} \iint _Q \left( - \mathbbm {h}({{\overline{\varphi }}}) \,p+ \kappa \overline{\lambda }_1 +b_0\, {\overline{u}}_1\right) \left( v - \overline{u}_1\right)\ge & {} 0 \quad \forall \, v \in C_1, \end{aligned}$$
$$\begin{aligned} \iint _Q \left( r + \kappa {\overline{\lambda }}_2 +b_0\,\overline{u}_2\right) \left( v - \overline{u}_2\right)\ge & {} 0 \quad \forall \, v \in C_2, \end{aligned}$$

with \({\overline{\varvec{\lambda }}}=({\overline{\lambda }}_1,{\overline{\lambda }}_2)\), where the sets \(\,C_i\), \(i = 1,2\), are defined by

$$\begin{aligned} C_i = \{v \in L^\infty (Q): {\underline{u}}_i \le v(x,t) \le {\hat{u}}_i \hbox { for a.a. } (x,t) \in Q\}, \end{aligned}$$

and where \({\overline{\lambda }}_i \), \(i = 1,2\), obey for almost every \(t\in (0,T)\) the conditions

$$\begin{aligned} {\overline{\lambda }}_i(\cdot ,t) \left\{ \begin{array}{lcl} \in B(0,1)&{}\,\hbox { if } \, \Vert {\overline{u}}_i(\cdot ,t) \Vert _{L^2(\Omega )} = 0\\ = \displaystyle \frac{{\overline{u}}_i(\cdot ,t)}{\Vert \overline{u}_i(\cdot ,t) \Vert _{L^2(\Omega )}}&{}\,\hbox { if } \, \Vert \overline{u}_i(\cdot ,t) \Vert _{L^2(\Omega )} \not = 0 \end{array} \right. . \end{aligned}$$

Applying Lemma 6.2 to each of the variational inequalities (6.12) and (6.13) separately, we arrive at the following result:

Theorem 6.3

(Directional sparsity in time for (\({\mathcal {CP}}_0\)))

Suppose that the general assumptions (A1)–(A4), (C1)–(C4) and (2.12)–(2.14) are fulfilled, and assume in addition that the constants \(\underline{u}_i,\widehat{u}_i \) in (A4) satisfy \(\underline{u}_i<0<{\widehat{u}}_i\) for \(i=1,2\). Let \({\overline{\mathbf{u}}} = ({\overline{u}}_1,{\overline{u}}_2)\) be an optimal control of the problem (\({\mathcal {CP}}_0\)) with sparsity functional \(\,g\,\) defined by (6.11), and with associated state \(({\overline{\mu }},{\overline{\varphi }},{\overline{\xi }},{\overline{\sigma }})={{\mathcal {S}}}_0(\overline{\mathbf{u}})\) and adjoint state \((p,q,r)\) having the properties stated in Theorem 5.2. Then there are functions \({\overline{\lambda }}_i\), \(i=1,2,\) that satisfy (6.14) and (6.12)–(6.13). In addition, for almost every \(t\in (0,T)\), we have that

$$\begin{aligned} \Vert {\overline{u}}_1(\cdot ,t)\Vert _{L^2(\Omega )} = 0 \quad\Longleftrightarrow & {} \quad \Vert \mathbbm {h}({\overline{\varphi }}(\cdot ,t))\, p(\cdot ,t)\Vert _{L^2(\Omega )} \le \kappa , \end{aligned}$$
$$\begin{aligned} \Vert {\overline{u}}_2(\cdot ,t)\Vert _{L^2(\Omega )} = 0 \quad\Longleftrightarrow & {} \quad \Vert r(\cdot ,t)\Vert _{L^2(\Omega )} \le \kappa . \end{aligned}$$

Moreover, if \((p,q,r)\) and \({\overline{\lambda }}_1, \overline{\lambda }_2\) are given, then the optimal controls \({\overline{u}}_1\), \({\overline{u}}_2\) are for almost every \((x,t)\in Q\) obtained from the pointwise formulae

$$\begin{aligned} {\overline{u}}_1(x,t)= & {} \max \left\{ {\underline{u}}_1, \,\min \left\{ \widehat{u}_1, -{b_0}^{-1}\,\left( -\mathbbm {h}({\overline{\varphi }}(x,t))\,p(x,t)+ \kappa {\overline{\lambda }}_1(x,t)\right) \right\} \,\right\} ,\\ {\overline{u}}_2(x,t)= & {} \max \left\{ {\underline{u}}_2 ,\, \min \left\{ {\widehat{u}}_2 , -{b_0}^{-1} \left( r(x,t) + \kappa {\overline{\lambda }}_2(x,t)\right) \right\} \,\right\} . \end{aligned}$$
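On sampled data, the pointwise formulae of Theorem 6.3 amount to clipping the unconstrained value to the interval \([\underline{u}_i, \widehat{u}_i]\). The following minimal sketch shows this. The function name and the array arguments are hypothetical, and the adjoint quantities and multipliers are assumed to be already available as arrays of grid values; computing them is not part of the sketch.

```python
import numpy as np

def projected_control(d, lam, kappa, b0, lower, upper):
    """Evaluate max{lower, min{upper, -(d + kappa*lam)/b0}} entrywise.

    For u_1, pass d = -h(phi) * p; for u_2, pass d = r (sampled values).
    """
    return np.clip(-(d + kappa * lam) / b0, lower, upper)

# Usage with made-up sample values (purely illustrative).
d = np.array([-10.0, 0.0, 10.0])
lam = np.zeros_like(d)
u = projected_control(d, lam, kappa=1.0, b0=2.0, lower=-1.0, upper=2.0)
print(u)
```

Here `np.clip(a, lower, upper)` coincides with the nested max/min expression because `lower <= upper`.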

Remark 6.4

By virtue of (6.15) and (6.16), optimal controls may vanish on \(\Omega \) for some time intervals. Since the functions \(t \mapsto \Vert \mathbbm {h}({{\overline{\varphi }}}(\cdot ,t)) \,p(\cdot ,t)\Vert _{L^2(\Omega )}\) and \(t \mapsto \Vert r(\cdot ,t)\Vert _{L^2(\Omega )}\) are continuous on \([0,T]\), it is clear that this is the case at least in all open subintervals where these functions are strictly smaller than \(\kappa \). We also note that one expects the support of optimal controls to shrink with increasing sparsity parameter \(\kappa \), although this effect can hardly be quantified. It would nevertheless be useful to confirm that optimal controls vanish for all sufficiently large values of \(\kappa \). While such a result can be shown for the differentiable approximating problems (\({\mathcal {CP}}_\gamma \)) and (\(\widetilde{{\mathcal {CP}}}_\gamma \)) (by arguing as in the proof of the corresponding [47, Thm. 4.5]), we are unfortunately unable to recover the necessary uniform bounds for \(p\) and \(r\) from the adjoint state system (5.49)–(5.52).

Remark 6.5

It is worth mentioning that, in medical applications, the controls have the meaning of medications or nutrient supplies, so that it does not seem meaningful to allow for negative controls \(u_1\) and \(u_2\); the negative lower bounds \(\underline{u}_i\), \(i=1,2\), may therefore appear rather odd. However, as commented in [47, Remark 4.4, p. 23], it is possible to introduce a suitable transformation of the controls such that the assumption of a negative lower bound is fulfilled by the reformulated problem. This change also brings a different notion of sparsity. In that scenario, sparsity reflects a quantified deviation of the control \(u_i\), for \(i=1,2\), from a constant control value \(u_i = c_i\), instead of from zero, for some fixed constants \(c_i\), \(i=1,2\), which are connected to the lower and upper bounds \(\underline{u}_i,\widehat{u}_i\), \(i=1,2\). This approach may be of interest for medical therapies in which a certain medication is maintained constant and permanently administered, while another one should be supplied only occasionally and as rarely as possible.

We conclude this section by briefly sketching the results for the other types of sparsity that are obtained if g is given by \(g_\Omega \,\) or \(g_Q\), respectively. In this respect, we refer to [47, Sect. 4.3].

Spatial sparsity: With the functional \(g({\mathbf{u}}) = g_\Omega (u_1) + g_\Omega (u_2)\), we may have regions in \(\Omega \) where the optimal controls vanish for almost every \(t\in (0,T)\). This is established by simply interchanging the roles of t and x. For instance, instead of the equivalences (6.15), (6.16), one obtains for almost every \(x \in \Omega \) that

$$\begin{aligned} \Vert {\overline{u}}_1(x,\cdot )\Vert _{L^2(0,T)} = 0 \quad\Longleftrightarrow & {} \quad \Vert \mathbbm {h}({{\overline{\varphi }}}(x,\cdot ))p(x,\cdot )\Vert _{L^2(0,T)} \le \kappa , \\ \Vert {\overline{u}}_2(x,\cdot )\Vert _{L^2(0,T)} = 0 \quad\Longleftrightarrow & {} \quad \Vert r(x,\cdot )\Vert _{L^2(0,T)} \le \kappa . \end{aligned}$$

Spatio-temporal sparsity: If g is defined from \(g_Q\) by \(g({{\mathbf {u}}}) = g_Q(u_1) + g_Q(u_2)\), then the equivalence relations

$$\begin{aligned} {\overline{u}}_1(x,t) = 0 \quad\Longleftrightarrow & {} \quad |\mathbbm {h}({{\overline{\varphi }}}(x,t))\,p(x,t)| \le \kappa , \\ {\overline{u}}_2(x,t) = 0 \quad\Longleftrightarrow & {} \quad |r(x,t)| \le \kappa , \end{aligned}$$

can be deduced for almost every \(\,(x,t) \in Q\). Therefore, the optimal controls may vanish in certain spatio-temporal subsets of \(\,Q\).
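For the spatio-temporal case, the pointwise equivalence can be checked on sampled data. The sketch below relies on the standard soft-thresholding representation for the pointwise problem of minimizing \(\tfrac{b_0}{2}u^2 + d\,u + \kappa |u|\) over \([\underline{u}_i, \widehat{u}_i]\) with \(\underline{u}_i< 0 < \widehat{u}_i\). This closed form is an assumption brought in for illustration and is not stated in the text above; the function name is hypothetical. As before, \(d\) stands for \(-\mathbbm {h}({\overline{\varphi }})\,p\) or for \(r\).

```python
import numpy as np

def l1_sparse_control(d, kappa, b0, lower, upper):
    """Soft-threshold d at level kappa, scale by -1/b0, clip to the bounds.

    Assumes lower < 0 < upper, so that zero is an admissible value.
    """
    shrunk = np.sign(d) * np.maximum(np.abs(d) - kappa, 0.0)
    return np.clip(-shrunk / b0, lower, upper)

# Usage with made-up sample values (purely illustrative).
d = np.array([-3.0, -1.0, 0.0, 0.5, 1.0, 4.0])
u = l1_sparse_control(d, kappa=1.0, b0=2.0, lower=-1.0, upper=0.5)
# The control vanishes exactly at the samples where |d| <= kappa.
print(np.array_equal(u == 0.0, np.abs(d) <= 1.0))
```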

7 Conclusions

A distributed optimal control problem for a tumor growth model of Cahn–Hilliard type with double obstacle potential is investigated. The controls \(u_1\) and \(u_2\) enter the system as external medication or nutrient supply. To handle the nondifferentiability of the double obstacle potential \(F\), which forces us to formulate Eq. (1.3) as a variational inequality, we employ a “deep quench” approach. Namely, the optimal control problem is seen as the limit of a suitable family of approximating control problems with logarithmic nonlinearities for which the existence of optimal controls and the necessary conditions for optimality are known, and the corresponding results are recovered by passing to the limit. To the best of the authors’ knowledge, this is the first time that this approach has been applied to cost functionals containing nondifferentiable terms like the \(L^1\)-norm that lead to sparsity effects. The rigorous asymptotic analysis is carried out in detail and yields, as a consequence, the existence of optimal controls as well as first-order necessary conditions for optimality. The drawback of this technique is that the adjoint system may be under-determined (cf. Theorem 5.2).