1 Introduction

Consider the following problem

$$\begin{aligned} {\left\{ \begin{array}{ll} dv(t,\xi ) = \left( \varDelta \gamma (v(t,\xi )) - I^{ion} (v(t,\xi )) - f(\xi ) v(t,\xi ) + F(t,\xi )\right) dt + v(t,\xi ) d W(t) \;, \xi \in {\mathcal {O}}\\ v(0,\xi ) =v_0(\xi ) \;, \\ \gamma (v(t,\xi )) = 0 \;, \text {on} \; (0,T) \times \partial {\mathcal {O}} \;. \end{array}\right. } \end{aligned}$$
(1)

Here \(\gamma : {\mathbb {R}} \rightarrow {\mathbb {R}}\) is a monotone, increasing continuous function, \(v=v(t, \xi )\) represents the transmembrane electrical potential, and \({\mathcal {O}}\subset {\mathbb {R}}^d\), \(d=2,3,\) is a bounded open set with smooth boundary \(\partial {\mathcal {O}}\). We indicate with \(\varDelta _\xi \) the Laplacian operator with respect to the spatial variable \(\xi \), \(f(\xi )\) is a given external forcing term, while \(I^{ion}\) is the ionic current which, according to the FitzHugh–Nagumo model, equals \(I^{ion}(v)= v(v-a)(v-1)\), namely a cubic non-linearity; the initial datum satisfies \(v_0 \in L^2({\mathcal {O}})\). Also, F is a bounded term needed to treat the general controlled equation in the next section.

Equation (1) with linear diffusion, i.e. \(\gamma (x)=x\), is the well-known FitzHugh–Nagumo (FHN) equation. The FHN equation is a reaction–diffusion equation, first introduced by Hodgkin and Huxley in [32] and then simplified in [31, 35]. The model has been proposed to provide a rigorous, yet simplified, analysis of the dynamics of electrical impulses along a nerve axon, see, e.g., [38], where the propagation of the transmembrane potential along the nerve axon is represented by a cubic nonlinear reaction term, possibly perturbed by a noisy one, see, e.g., [12, 22, 38, 41].

The random perturbation represents the effect of noisy input currents within neurons, their source being the random opening/closing actions of ion channels, see, e.g., [41]. Moreover, in two-dimensional and three-dimensional settings, Eq. (1) also plays a relevant role in statistical mechanics, under the name of Ginzburg–Landau equation, as well as in phase transition models of Ginzburg–Landau type, see, e.g., [27].

The general case where \(\gamma \) is a monotone function corresponds to an anomalous–diffusive FitzHugh–Nagumo (FHN) equation, see [33], also describing phase transitions in porous media, see, e.g., [36, 40].

Remark 1.1

In what follows we shall focus on the mathematical setting behind the Stochastic FitzHugh–Nagumo (FHN) model, without entering into details about the neuro-biological justification of the parameters characterizing it. Appropriate details, as well as an in-depth analysis of the existing literature on the subject, will be provided later.

We assume that W is an \(H^{-1}:=H^{-1}({\mathcal {O}})\)-cylindrical Wiener process, namely

$$\begin{aligned} \begin{aligned} W(t,\xi )&= \sum _{n= 1}^\infty \mu _n e_n \beta _n(t) \,,\quad t \ge 0\,,\xi \in {\mathcal {O}}\,,\\ \end{aligned} \end{aligned}$$

where \(\{\beta _n\}_{n \ge 1}\) is a sequence of mutually independent standard Brownian motions defined on a filtered probability space \(\left( \varOmega ,{\mathcal {F}},\left\{ {\mathcal {F}}_t\right\} _{t \ge 0},{\mathbb {P}}\right) \), while \(\{e_n\}_{n \ge 1}\) is an orthonormal basis in \(H^{-1}\) and \(\mu _n \in {\mathbb {R}}\).
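
For the reader's convenience, the following minimal Python sketch (purely illustrative, not part of the analysis) samples a truncated version of the above expansion; the one-dimensional toy domain \({\mathcal {O}}=(0,1)\), the sine basis \(e_n(\xi ) = \sqrt{2}\sin (n\pi \xi )\) and the coefficients \(\mu _n = n^{-2}\) are hypothetical choices, made only so that the code is self-contained.

```python
import numpy as np

# Sample W(t, xi) = sum_{n<=N} mu_n e_n(xi) beta_n(t) on the toy domain (0,1).
rng = np.random.default_rng(0)
N, n_t, T = 20, 500, 1.0
dt = T / n_t
xi = np.linspace(0.0, 1.0, 101)

mu = 1.0 / np.arange(1, N + 1) ** 2                                   # mu_n
e = np.sqrt(2.0) * np.sin(np.outer(np.arange(1, N + 1), np.pi * xi))  # e_n(xi)

# Mutually independent standard Brownian motions beta_n as cumulative sums.
dbeta = rng.normal(0.0, np.sqrt(dt), size=(N, n_t))
beta = np.concatenate([np.zeros((N, 1)), np.cumsum(dbeta, axis=1)], axis=1)

W = np.einsum('n,nx,nt->tx', mu, e, beta)   # W(t, xi), shape (n_t + 1, 101)
print(W.shape, float(np.abs(W).max()))
```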

Since the Laplacian \(\varDelta _\xi \) is a linear operator in \(L^2\left( {\mathcal {O}}\right) \) and \(-\varDelta _\xi \) (endowed with homogeneous Dirichlet boundary conditions) is self-adjoint with compact resolvent, there exists a complete orthonormal system \(\{{\bar{e}}_k\}_{k\ge 1}\) in \(L^2\left( {\mathcal {O}}\right) \) of eigenfunctions of \(-\varDelta _\xi \); we shall denote the corresponding sequence of eigenvalues by \(\{{\bar{\lambda }}_k\}_{k\ge 1}\). Therefore, we have

$$\begin{aligned} \varDelta _\xi {\bar{e}}_k = -{\bar{\lambda }}_k {\bar{e}}_k\,,\quad k \in {\mathbb {N}}\;. \end{aligned}$$

Also, we set

$$\begin{aligned} Gv = I^{ion}(v) = v(v-a)(v-1) \;, \end{aligned}$$
(2)

and note that G is monotonically nondecreasing.

The present paper addresses the problem of existence and uniqueness of a strong solution, in a sense to be specified below, to Eq. (1). We stress that this is not a trivial problem, as the nonlinear operator \(\varDelta \gamma \) is naturally defined on the space \(H^{-1}\), whereas the nonlinear polynomial perturbation \(I^{ion}\) is not m-accretive on the same space. In order to solve the above problem we will transform the original equation, via a rescaling transformation, into a random PDE. It turns out that existence and uniqueness for the transformed random PDE can be treated by the theory of nonlinear semigroups in \(L^1\).

We will further consider the problem of existence of an optimal control for the nonlinear FHN equation. Again, in order to solve the problem we will apply a rescaling transformation to obtain a corresponding random PDE. As already emerged in [12, 22], the nonlinear polynomial term implies that standard minimization arguments do not apply. Therefore, existence of an optimal control is achieved using Ekeland’s variational principle. First order conditions of optimality are usually given in terms of a dual backward stochastic equation, see, e.g., [12, 17]; here, thanks to the applied rescaling transformation, they are expressed in terms of a random backward dual equation, which simplifies the setting and also gives more insight into the derived optimal controller.

It is worth stressing that the present work continues the investigation of the optimal control problem for a stochastic FHN system, generalizing further the results presented in [12, 22]. The techniques used in the present work, although sharing some similarities with [12, 22], such as the usage of the Ekeland principle to treat the cubic nonlinearity typical of the FHN equation, are in general different. In fact, the nonlinear term \(\gamma \) poses several difficulties, since its natural state space is \(H^{-1}\). As already mentioned, the cubic nonlinear term arising in the FHN equation is not sufficiently regular in such a space. Therefore, differently from the previous works [12, 22], existence and uniqueness of a solution for the main equation is non-trivial. The nonlinear diffusion \(\gamma \) also affects the main technique used in proving the existence of an optimal control, since a suitable transformation reducing the problem to a random equation must be applied.

The present paper is structured as follows: Sect. 1.1 introduces the main notation used throughout the paper. Section 2 addresses the problem of proving existence and uniqueness for the state equation, whereas in Sect. 3 the existence of an optimal control is considered.

1.1 Main Notations

In what follows we will denote by \(| \cdot |\), resp. \(\langle \cdot ,\cdot \rangle \), the norm, resp. scalar product, on \({\mathbb {R}}^d\). Also, \(L^p\left( {\mathcal {O}}\right) =: L^p\), for \(1 \le p \le \infty \), is the standard space of p-integrable Lebesgue measurable functions over the domain \({\mathcal {O}}\subset {\mathbb {R}}^d\), with corresponding norm denoted by \(|\cdot |_p\). For the case \(p=2\), we will further denote by \(\langle \cdot ,\cdot \rangle _2\) the scalar product in \(L^2\). The space \(H^1({\mathcal {O}})=: H^1\) is the Sobolev space \(\left\{ u \in L^2\,:\,\partial _\xi u(\xi ) \in L^2\right\} \), endowed with the standard norm \(\Vert u\Vert _{H^1}^2:= \int _{{\mathcal {O}}} \left( |u|^2 + |\nabla u|^2 \right) d\xi \), and \(H^1_0({\mathcal {O}})=:H^1_0\) is the closure of \(C^\infty _c({\mathcal {O}})\) in \(H^1\). The dual of the space \(H^1_0\) will be denoted by \(H^{-1}\), equipped with the corresponding norm \(|\cdot |_{-1}\).

Similarly, we will denote by \(W^{n,p}({\mathcal {O}})=:W^{n,p}\), \(n \in {\mathbb {N}}\), \(1\le p \le \infty \), the standard Sobolev space of p-integrable functions with p-integrable n-th order derivatives. Coherently, \(W^{1,p}([0,T];H^{-1})\) will be the space of absolutely continuous functions \(u:[0,T]\rightarrow H^{-1}\) such that both u and \(\frac{d}{dt}u\) belong to \(L^p([0,T];H^{-1})\). Further, given a Banach space X, \(L^p([0,T];X)\) is the space of X-valued Bochner p-integrable functions on the interval [0, T]. Also, C([0, T]; X), resp. \(C^1([0,T];X)\), denotes the space of continuous, resp. continuously differentiable, functions \(u:[0,T]\rightarrow X\).

We shall also introduce \(C_W([0,T];H^{-1})\), the space of all \(H^{-1}\)-valued \(\left( {\mathcal {F}}_t\right) \)-adapted processes such that \(X \in C\left( [0,T];L^2\left( \varOmega ;H^{-1}\right) \right) \), that is, X satisfies

$$\begin{aligned} \sup _{t \in [0,T]} {\mathbb {E}}|X(t)|^2_{-1} <\infty \,. \end{aligned}$$

In an analogous manner \(L^2_W([0,T];H^{-1})\) is the space of all \(H^{-1}\)-valued \(\left( {\mathcal {F}}_t\right) \)-adapted processes such that \(X \in L^2\left( [0,T];L^2\left( \varOmega ;H^{-1}\right) \right) \), that is X satisfies

$$\begin{aligned} \int _{0}^T {\mathbb {E}}|X(t)|^2_{-1} dt<\infty \,. \end{aligned}$$

At last \(L^2_W(\varOmega ;C\left( [0,T];H^{-1}\right) )\) denotes the space of all \(H^{-1}\)-valued \(\left( {\mathcal {F}}_t\right) \)-adapted and continuous processes such that

$$\begin{aligned} {\mathbb {E}} \sup _{t \in [0,T]} \left| X(t)\right| ^2_{-1}<\infty \,. \end{aligned}$$

The above definitions remain valid if, instead of \(H^{-1}\), we consider a general Hilbert space H. It is also known that there is a natural embedding of \(L^2_W(\varOmega ;C\left( [0,T];H^{-1}\right) )\) into the space \(C_W([0,T];H^{-1})\), see, e.g., [13, Chapter 1].

We can therefore rewrite Eq. (1) as

$$\begin{aligned} {\left\{ \begin{array}{ll} dX(t)-[\varDelta (\gamma ( X(t)))-G(X(t)) - fX(t) + F]dt =X(t)dW(t)\,,\quad t \in [0,T]\,,\\ X(0) = x \in H^{-1}\,. \end{array}\right. } \end{aligned}$$
(3)

We will assume the following to hold.

Hypothesis 1

  1. (i)

    \(\gamma :{\mathbb {R}}\rightarrow {\mathbb {R}}\) with \(\gamma (0)=0\) is a continuous and differentiable function such that

    $$\begin{aligned} 0< C_1 \le \gamma '(x) \le C_2 < \infty \,,\quad \forall \, x \in {\mathbb {R}}\,; \end{aligned}$$
  2. (ii)

    \(G: {\mathbb {R}}\rightarrow {\mathbb {R}}\) is continuous monotonically non–decreasing and \(G(0) = 0\); moreover G is locally Lipschitzian;

  3. (iii)

\(F \in L^\infty ((0,T)\times {\mathcal {O}})\), \({\mathbb {P}}\)-a.s., and it is progressively measurable on \((0,T) \times \varOmega \times {\mathcal {O}}\); \(f \in L^\infty ({\mathcal {O}})\), and \(f \ge 0\) a.e. in \({\mathcal {O}}\);

  4. (iv)

W is an \(H^{-1}:=H^{-1}({\mathcal {O}})\)-cylindrical Wiener process, that is,

    $$\begin{aligned} W = \sum _{j=1}^\infty \mu _j e_j \beta _j\,, \end{aligned}$$

    with

    $$\begin{aligned} \sum _{j=1}^\infty \mu _j^2 |e_j|^2_{L^\infty ({\mathcal {O}})} < \infty \,, \end{aligned}$$

see [13, p. 22].
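
As a sanity check, the following hedged sketch exhibits concrete data fulfilling Hypothesis 1; all choices are illustrative assumptions (in particular, \(G(x) = x + x^3\) is a globally monotone stand-in, used here in place of the FHN cubic).

```python
import numpy as np

# (i): gamma(x) = 2x + sin(x) has gamma'(x) = 2 + cos(x) in [1, 3],
#      so the bounds hold with C1 = 1, C2 = 3, and gamma(0) = 0.
gamma = lambda x: 2.0 * x + np.sin(x)
# (ii): G monotone nondecreasing, locally Lipschitz, G(0) = 0.
G = lambda x: x + x ** 3
# (iii): f bounded and nonnegative, F bounded and deterministic.
f = lambda xi: 1.0 + 0.0 * xi
F = lambda t, xi: np.cos(np.pi * xi)

# (iv): with e_n = sqrt(2) sin(n pi xi) one has |e_n|_inf = sqrt(2), hence
#       sum_n mu_n^2 |e_n|_inf^2 = 2 sum_n n^{-4} < infty for mu_n = n^{-2}.
n = np.arange(1, 10_000)
print(float(2.0 * np.sum(1.0 / n ** 4)))   # ~ pi^4 / 45
```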

Then, we can state the notion of solution to Eq. (3) that we will consider in subsequent analysis, see [13, p. 50].

Definition 1.1

Let \(x \in H^{-1}\); we say that the process

$$\begin{aligned} X \in L^2_W \left( \varOmega ;C\left( [0,T];H^{-1}\right) \right) \cap L^2_W([0,T];L^2)\,, \end{aligned}$$

is a solution to (3) if \(X: [0,T] \rightarrow H^{-1}\) is \({\mathbb {P}}\)-a.s. continuous and, \(\forall \) \(t \in [0,T]\),

$$\begin{aligned} X(t) = x + \int _0^t \left( \varDelta (\gamma (X(s))) - G(X(s)) - fX(s) + F(s)\right) ds + \int _0^t X(s) dW(s). \end{aligned}$$
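
To fix ideas, here is a minimal explicit Euler–Maruyama sketch for Eq. (3) on the toy one-dimensional domain \((0,1)\), with the hypothetical coefficients of the previous sketch; this is a purely illustrative discretization (the paper works in \(d=2,3\)), not the construction used in the sequel.

```python
import numpy as np

rng = np.random.default_rng(1)
nx, T, dt = 33, 0.1, 1e-4                  # grid points, horizon, time step
xi = np.linspace(0.0, 1.0, nx)
dx = xi[1] - xi[0]

gamma = lambda x: 2.0 * x + np.sin(x)      # hypothetical diffusion nonlinearity
G = lambda x: x + x ** 3                   # hypothetical monotone reaction term
fcoef = np.ones_like(xi)                   # f >= 0
Fterm = np.cos(np.pi * xi)                 # bounded forcing F

N = 10                                     # truncation of the noise expansion
mu = 1.0 / np.arange(1, N + 1) ** 2
e = np.sqrt(2.0) * np.sin(np.outer(np.arange(1, N + 1), np.pi * xi))

X = np.sin(np.pi * xi)                     # initial datum x
for _ in range(int(T / dt)):
    g = gamma(X)
    lap = np.zeros_like(X)
    lap[1:-1] = (g[2:] - 2.0 * g[1:-1] + g[:-2]) / dx ** 2
    dW = e.T @ (mu * rng.normal(0.0, np.sqrt(dt), N))  # increment of W(t, .)
    X = X + dt * (lap - G(X) - fcoef * X + Fterm) + X * dW
    X[0] = X[-1] = 0.0                     # homogeneous Dirichlet condition
print(float(np.abs(X).max()))
```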

2 Existence for the State Equation

The main problem in proving existence and uniqueness of a solution to Eq. (3) is that the operator G is not m-accretive on the space \(H^{-1}\), so that the basic existence results in [11, 13] are not applicable in the present case. It turns out that the proper space in which to successfully treat Eq. (3) is \(L^1\), which, on the other hand, is not well suited to SPDEs such as (3).

To overcome such a stalemate, we follow [9, 10]. In particular, we apply the transformation \(X=e^Wy\), which allows us to reduce the stochastic equation (3) to a random PDE that can be treated with analytical techniques. In fact, the random equation can be successfully solved by exploiting the theory of nonlinear semigroups in \(L^1\). As noted in [10], we still have to face the problem that, because of the lack of regularity of the term W, the general theory cannot be applied straightforwardly to the resulting random PDE. Therefore, for \(\epsilon >0\), we shall consider a suitable sequence of regular approximations \(W_\epsilon \) of W, first establishing a priori estimates for the solutions \(y_\epsilon \) of the associated \(W_\epsilon \)-approximating problem, and then showing that, in the limit \(\epsilon \rightarrow 0\), we obtain both existence and uniqueness of the solution to the original equation.

The following theorem constitutes the main existence result of this section.

Theorem 2.1

Let \(x \in H^{-1}\cap L^1\) with \(\gamma (x) \in H^1_0\); then Eq. (3) admits a unique strong solution \(X=e^Wy\), which satisfies

$$\begin{aligned} Xe^{-W} \in W^{1,2}([0,T];H^{-1}) \cap L^\infty ((0,T) \times {\mathcal {O}})\,,\quad {\mathbb {P}}-a.s. \end{aligned}$$

In order to prove Theorem 2.1 we need some auxiliary lemmas. Let us then introduce the transformation

$$\begin{aligned} X(t)=e^{W(t)} y(t) \,,\quad t \ge 0\,, \end{aligned}$$
(4)

so that by an application of the Itô formula we obtain the random equation

$$\begin{aligned} \begin{aligned}&\frac{\partial }{\partial t} y + e^{-W} G(e^W y) - e^{-W} \varDelta \gamma (e^W y) + fy + \mu y = e^{-W}F \,\\&y(0,\xi ) = x(\xi )\,,\quad \xi \in {\mathcal {O}}\,,\\&y(t) \in H^1_0({\mathcal {O}})\,,\, t \in (0,T) \end{aligned} \end{aligned}$$
(5)

with

$$\begin{aligned} \mu = \frac{1}{2} \sum _{n=1}^\infty \mu _n^2 e_n^2\,, \end{aligned}$$

see, e.g. [7, 9, 10].
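
Note that \(\mu \) is an explicit, bounded function of the space variable. A short sketch, under the same hypothetical truncated basis as above, computes it and applies the rescaling (4) to a frozen sample of W.

```python
import numpy as np

xi = np.linspace(0.0, 1.0, 101)
N = 20
mu_n = 1.0 / np.arange(1, N + 1) ** 2
e = np.sqrt(2.0) * np.sin(np.outer(np.arange(1, N + 1), np.pi * xi))

mu = 0.5 * np.sum((mu_n[:, None] * e) ** 2, axis=0)  # mu(xi) = (1/2) sum mu_n^2 e_n^2
print(float(mu.max()))                               # bounded on O, as used in (5)

W = e.T @ (mu_n * 0.3)   # frozen sample: beta_n(t) = 0.3 for all n, for illustration
X = np.sin(np.pi * xi)
y = np.exp(-W) * X       # rescaling (4): X = exp(W) y
assert np.allclose(np.exp(W) * y, X)
```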

Following [10], we prove the existence of a unique strong solution to Eq. (5) by first considering an approximating problem. In particular, let us denote by \(\beta _n^\epsilon (t) := (\beta _n * \rho _\epsilon )(t)\), where \(\rho _\epsilon (t) = \frac{1}{\epsilon }\rho \left( \frac{t}{\epsilon }\right) \) is a standard mollifier with \(\rho \in C^\infty _0\); then we have that \(\beta _n^\epsilon \in C^1([0,T];{\mathbb {R}})\). Setting

$$\begin{aligned} W_\epsilon (t,\xi ) = \sum _{n= 1}^\infty \mu _n e_n \beta _n^\epsilon (t) \,,\quad t \ge 0\,,\xi \in {\mathcal {O}}\,, \end{aligned}$$

we thus have that \(W_\epsilon \in C^1([0,T] \times {\mathcal {O}})\). Moreover

$$\begin{aligned} W_\epsilon (t,\xi ) \rightarrow W(t,\xi ) \,\quad \text{ uniformly } \text{ in } \quad (t,\xi ) \in [0,T]\times {\mathcal {O}}\,, \end{aligned}$$

as \(\epsilon \rightarrow 0\).
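
The mollification step can be visualized numerically: the sketch below (illustrative choices throughout) convolves a sampled Brownian path with a compactly supported bump \(\rho _\epsilon \) and reports the uniform distance to the original path, which shrinks as \(\epsilon \rightarrow 0\).

```python
import numpy as np

rng = np.random.default_rng(2)
n_t, T = 2000, 1.0
dt = T / n_t
beta = np.concatenate([[0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), n_t))])

def mollify(path, eps):
    """Convolve a sampled path with rho_eps(t) = rho(t / eps) / eps."""
    s = np.arange(-eps, eps + dt, dt)
    rho = np.zeros_like(s)
    inside = np.abs(s) < eps * (1.0 - 1e-12)
    rho[inside] = np.exp(-1.0 / (1.0 - (s[inside] / eps) ** 2))
    rho /= rho.sum() * dt                        # normalize: integral of rho_eps = 1
    half = len(s) // 2
    padded = np.pad(path, half, mode='reflect')  # extend the path near t = 0, T
    return np.convolve(padded, rho, mode='same')[half:-half] * dt

for eps in (0.1, 0.05, 0.01):
    err = np.abs(mollify(beta, eps) - beta).max()
    print(f"eps = {eps:5.2f}   sup_t |beta^eps - beta| = {err:.4f}")
```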

For each \(\epsilon >0\), let us thus consider the approximating equation associated to (5)

$$\begin{aligned}&\frac{\partial }{\partial t} y_\epsilon + e^{-W_\epsilon } G_\epsilon (e^{W_\epsilon } y_\epsilon ) - e^{-W_\epsilon } \varDelta \left( \gamma (e^{W_\epsilon } y_\epsilon ) + \epsilon e^{W_\epsilon }y_\epsilon \right) + fy_\epsilon + \mu y_\epsilon = e^{-W_\epsilon } F\,\nonumber \\&y_\epsilon (0,\xi ) = x(\xi )\,,\quad \xi \in {\mathcal {O}}\,,\nonumber \\&y_\epsilon (t) \in H^1_0({\mathcal {O}})\,,\, t \in (0,T) \end{aligned}$$
(6)

where \(G_\epsilon \) is the Yosida approximation of G, that is

$$\begin{aligned} G_\epsilon := \frac{1}{\epsilon } \left( I - (I + \epsilon G)^{-1}\right) \,, \quad \epsilon >0\,. \end{aligned}$$
(7)

Note that \(G_\epsilon \) is monotonically non-decreasing, Lipschitz continuous, and

$$\begin{aligned} \lim _{\epsilon \rightarrow 0} G_\epsilon (z) = G(z)\,,\, \forall z \in {\mathbb {R}} \end{aligned}$$

uniformly on compacts.
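
For the monotone stand-in \(G(x) = x + x^3\) used in the earlier sketches (a hypothetical choice, not the FHN cubic), the resolvent \((I+\epsilon G)^{-1}\), and hence \(G_\epsilon \), can be computed numerically; the sketch below does so by Newton's method and illustrates the convergence \(G_\epsilon \rightarrow G\).

```python
import numpy as np

def resolvent(x, eps, iters=50):
    """Unique root z of eps*z**3 + (1 + eps)*z - x = 0 (strictly monotone map)."""
    z = np.zeros_like(x)
    for _ in range(iters):
        z = z - (eps * z**3 + (1 + eps) * z - x) / (3 * eps * z**2 + 1 + eps)
    return z

def G_eps(x, eps):
    return (x - resolvent(x, eps)) / eps   # the Yosida approximation (7)

x = np.linspace(-2.0, 2.0, 9)
for eps in (0.5, 0.1, 0.01):
    gap = np.abs(G_eps(x, eps) - (x + x**3)).max()
    print(f"eps = {eps:4.2f}   max |G_eps - G| on [-2, 2] = {gap:.4f}")
# G_eps is globally Lipschitz for each eps and converges to G on compacts.
```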

Defining \(z_\epsilon := e^{W_\epsilon }y_\epsilon \), Eq. (6) becomes

$$\begin{aligned}&\frac{\partial }{\partial t} z_\epsilon + G_\epsilon (z_\epsilon ) - \varDelta \left( \gamma (z_\epsilon )+\epsilon z_\epsilon \right) + fz_\epsilon + \left( \mu - \frac{\partial }{\partial t} W_\epsilon \right) z_\epsilon = F_\epsilon \,, \quad \text{ in } \, (0,T) \times {\mathcal {O}}\,,\nonumber \\&z_\epsilon (0,\xi ) = x(\xi )\,,\quad \xi \in {\mathcal {O}}\,,\nonumber \\&\gamma (z_\epsilon (t)) + \epsilon z_\epsilon (t)\in H^1_0({\mathcal {O}})\,,\, t \in (0,T) \end{aligned}$$
(8)

where \(F_\epsilon := e^{-W_\epsilon } F\).

Lemma 2.2

Let \(x \in H^{-1}\cap L^1\) with \(\gamma (x) \in H^1_0\); then, for each \(\epsilon >0\), Eq. (6) has a unique solution such that

$$\begin{aligned} y_\epsilon \in W^{1,\infty } \left( [0,T];H^{-1}\right) \cap L^\infty \left( 0,T;H^1_0\right) . \end{aligned}$$

Proof

Let us first prove existence and uniqueness of a solution to Eq. (8) in the space \(H^{-1}\). For a fixed \(\epsilon >0\), let us define the operator \(A:D(A)\subset H^{-1}\rightarrow H^{-1}\) as

$$\begin{aligned} \begin{aligned} Az&= - \varDelta \left( \gamma (z) + \epsilon z\right) + fz + G_\epsilon (z) + \mu z\,,\\ D(A)&= \left\{ z \in L^2 \,:\, \gamma (z) \in H^1_0 \right\} \,. \end{aligned} \end{aligned}$$
(9)

We equip the space \(H^{-1}\) with the scalar product

$$\begin{aligned} \langle y,z\rangle _{-1} := {}_{H^1_0}\langle \left( -\varDelta \right) ^{-1} y,z\rangle _{H^{-1}} \,, \end{aligned}$$

where \(\left( -\varDelta \right) ^{-1} y = x\) indicates the solution to the Dirichlet problem \(-\varDelta x = y\) in \({\mathcal {O}}\), \(x \in H^1_0\).
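
For intuition only, this scalar product admits a direct finite-difference realization on the toy domain \((0,1)\): solve the discrete Dirichlet problem, then pair in \(L^2\); the grid and test function below are illustrative.

```python
import numpy as np

n = 100                                   # interior grid points
h = 1.0 / (n + 1)
xi = np.linspace(h, 1.0 - h, n)
A = (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h ** 2     # -Delta with Dirichlet conditions

def inner_m1(y, z):
    x = np.linalg.solve(A, y)             # x = (-Delta)^{-1} y, i.e. x in H^1_0
    return h * np.dot(x, z)               # discretized duality pairing <y, z>_{-1}

y = np.sin(np.pi * xi)                    # eigenfunction: -Delta y = pi^2 y
print(inner_m1(y, y), 0.5 / np.pi ** 2)   # |y|_{-1}^2 = |y|_2^2 / pi^2 = 1/(2 pi^2)
```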

Taking into account that \(G_\epsilon \) is Lipschitz continuous in \(L^2\) and since

$$\begin{aligned} z \mapsto -\varDelta (\gamma (z)+\epsilon z)\,, \end{aligned}$$

is m-accretive in the space \(H^{-1}\), see, e.g., [4, p. 68], we have that, for a suitable \(\alpha = \alpha _\epsilon \), it holds

$$\begin{aligned} \langle (A+\alpha I)z - (A+\alpha I){\bar{z}},z-{\bar{z}}\rangle _{-1} \ge 0\, \end{aligned}$$

which implies \((A+\alpha I)\) to be accretive in \(H^{-1}\).

Moreover, for \(\lambda >0\) sufficiently large, we also have \({\mathcal {R}}\left( (\lambda + \alpha )I + A\right) =H^{-1}\), so that A is quasi-m-accretive. In other words, for \({\tilde{f}} \in H^{-1}\) the equation

$$\begin{aligned} (\lambda + \alpha )\left( - \varDelta \right) ^{-1}z + \gamma (z) + \epsilon z + \left( - \varDelta \right) ^{-1}\left( G_\epsilon (z) + fz +\mu z\right) = \left( - \varDelta \right) ^{-1}{\tilde{f}}\,, \end{aligned}$$
(10)

has a unique solution \(z \in L^2\). Indeed, introducing the operators

$$\begin{aligned} \begin{aligned} B:L^2 \rightarrow L^2\,,\quad Bz := \gamma (z)\,, \end{aligned} \end{aligned}$$

and

$$\begin{aligned} \begin{aligned}&\varGamma : L^2 \rightarrow L^2\,,\\&\quad \varGamma z = (\lambda + \alpha )\left( - \varDelta \right) ^{-1}z + \left( - \varDelta \right) ^{-1}\left( G_\epsilon (z) + fz +\mu z\right) \,, \end{aligned} \end{aligned}$$

we see that Eq. (10) can be rewritten as

$$\begin{aligned} \epsilon z + Bz + \varGamma z = \left( - \varDelta \right) ^{-1}{\tilde{f}}\,. \end{aligned}$$
(11)

Since B is m-accretive and \(\varGamma \) is m-accretive and continuous in \(L^2\), it follows, see, e.g., [4, p. 104], that \({\mathcal {R}}(\epsilon I + B + \varGamma )=L^2\), so that Eq. (11) admits a unique solution z in \(L^2\). Moreover, since \(\gamma (z) + \epsilon z \in H^1\) and the inverse map of \(z \mapsto \gamma (z) + \epsilon z\) is Lipschitz, then \(z \in D(A)\). It follows that, applying [10, Lemma A.1, Corollary A.2], see also [4, Sect. 4], \(z_\epsilon \) is a strong solution to Eq. (8) in \(W^{1,\infty }([0,T];H^{-1})\). In addition, by [10, Corollary A.2], we also have

$$\begin{aligned} \gamma (z_\epsilon ) + \epsilon z_\epsilon - \left( - \varDelta \right) ^{-1} G_\epsilon (z_\epsilon ) \in L^\infty (0,T;H^1_0)\,, \end{aligned}$$

and

$$\begin{aligned} \left| \left( - \varDelta \right) ^{-1}G_\epsilon (z_\epsilon )\right| _2 \le C_\epsilon |z_\epsilon |_{-1}\,, \end{aligned}$$

so that, since \(z_\epsilon \in W^{1,\infty }([0,T];H^{-1})\), we obtain

$$\begin{aligned} \gamma (z_\epsilon ) + \epsilon z_\epsilon \in L^\infty (0,T;L^2)\,, \end{aligned}$$

and consequently \(z_\epsilon \in L^\infty (0,T;L^2)\). Moreover we have that

$$\begin{aligned} \left( - \varDelta \right) ^{-1}G_\epsilon (z_\epsilon ) \in L^\infty (0,T;H^1_0)\,, \end{aligned}$$

which implies that \(\gamma (z_\epsilon ) + \epsilon z_\epsilon \in L^\infty (0,T;H^1_0)\) and consequently \(z_\epsilon \in L^\infty (0,T;H^1_0)\). \(\square \)

Lemma 2.3

Let \(x \in D(A)\); then \(y_\epsilon \in L^\infty ((0,T)\times {\mathcal {O}})\) and it holds

$$\begin{aligned} \begin{aligned} \sup _{\epsilon } \left\{ |y_\epsilon |_{L^\infty ((0,T)\times {\mathcal {O}})} \right\} \le C(1+|x|_\infty )\,. \end{aligned} \end{aligned}$$
(12)

Proof

Let \(\alpha \in C^1([0,T])\), such that \(\alpha (0) = 0\) and \(\alpha '\ge 0\). Then, defining \(M:= (1+|x|_\infty )\), we have

$$\begin{aligned} \begin{aligned}&\frac{\partial }{\partial t} \left( y_\epsilon - M - \alpha (t) \right) + e^{-W_\epsilon } \left( G_\epsilon \left( e^{W_\epsilon } y_\epsilon \right) - G_\epsilon \left( e^{W_\epsilon }(M + \alpha (t))\right) \right) \\&\qquad \qquad \quad \qquad - f \left( y_\epsilon -M-\alpha (t)\right) - \,e^{-W_\epsilon } \varDelta \left( \gamma (e^{W_\epsilon } y_\epsilon )+\epsilon e^{W_\epsilon }y_\epsilon \right) \\&\qquad \qquad \quad \qquad +\, e^{-W_\epsilon }\varDelta \left( \gamma \left( e^{W_\epsilon }\left( M+\alpha (t)\right) \right) + \epsilon e^{W_\epsilon }\left( M+\alpha (t)\right) \right) \\&\qquad \qquad \quad \qquad +\, \mu (y_\epsilon -M-\alpha (t)) = {\tilde{F}}_\epsilon -\alpha '\,, \end{aligned} \end{aligned}$$

with

$$\begin{aligned} \begin{aligned} {\tilde{F}}_\epsilon&:= - e^{-W_\epsilon }G_\epsilon (e^{W_\epsilon }(M + \alpha (t)))-\mu (M+\alpha (t))- f \left( M + \alpha (t)\right) \\&\qquad +\, e^{-W_\epsilon } \varDelta \gamma \left( e^{W_\epsilon }\left( M+\alpha (t)\right) \right) + \epsilon (M+\alpha (t))e^{-W_\epsilon }\varDelta (e^{W_\epsilon })+ e^{-W_\epsilon }F\,, \end{aligned} \end{aligned}$$

with \(\alpha \) such that \({\tilde{F}}_\epsilon - \alpha ' \le 0\).

Following [10, Lemma 3.3], we first assume that

$$\begin{aligned} \frac{\partial }{\partial t} y_\epsilon \,,\quad \varDelta \left( \gamma (e^{W_\epsilon } y_\epsilon )+\epsilon e^{W_\epsilon }y_\epsilon \right) \in L^1((0,T)\times {\mathcal {O}})\,, \end{aligned}$$
(13)

then, denoting by

$$\begin{aligned} \begin{aligned} J(t):&= - \int _{{\mathcal {O}}} e^{-W_\epsilon }\left[ \varDelta \left( \gamma (e^{W_\epsilon }y_\epsilon )+\epsilon e^{W_\epsilon }y_\epsilon \right) \right. \\&\quad - \, \left. \varDelta \left( \gamma \left( e^{W_\epsilon }\left( M+\alpha (t)\right) \right) + \epsilon e^{W_\epsilon }(M+\alpha (t))\right) \right] sign\left( y_\epsilon -M -\alpha (t)\right) ^+ d\xi \,, \end{aligned} \end{aligned}$$

we have

$$\begin{aligned} \int _0^t J(s) ds \ge - (\gamma '(e^{|W|_\infty }M)+1)e^{|W|_\infty }(|\varDelta W|_\infty + |\nabla W|_\infty ^2)\int _0^t |(y_\epsilon - (M+\alpha (s)))^+|_1 ds\,. \end{aligned}$$

Moreover, by Hypothesis 1, \(G_\epsilon \) is monotone, so that

$$\begin{aligned} \begin{aligned}&\int _0^t \int _{{\mathcal {O}}} e^{-W_\epsilon } \left( G_\epsilon (e^{W_\epsilon } y_\epsilon )- G_\epsilon (e^{W_\epsilon }(M + \alpha (s)))\right) sign(e^{-W_\epsilon }\left( y_\epsilon - M - \alpha (s)\right) )^+ ds d\xi \\&\quad =\int _0^t \int _{{\mathcal {O}}} e^{-W_\epsilon } \left( G_\epsilon \left( e^{W_\epsilon } y_\epsilon \right) - G_\epsilon \left( e^{W_\epsilon }\left( M + \alpha (s)\right) \right) \right) sign\left( G_\epsilon (e^{W_\epsilon } y_\epsilon )\right. \\&\qquad \left. - G_\epsilon \left( e^{W_\epsilon }\left( M + \alpha (s)\right) \right) \right) ^+ ds d\xi \ge 0\,, \end{aligned} \end{aligned}$$

and since

$$\begin{aligned} \begin{aligned}&\int _{{\mathcal {O}}} \frac{\partial }{\partial t}\left( y_\epsilon - M -\alpha (s)\right) sign\left( y_\epsilon - M -\alpha (s)\right) ^+ d\xi \\&\quad \quad \quad = \frac{d}{dt}|\left( y_\epsilon (t) - M -\alpha (t)\right) ^+|_1 \,, \text{ a.e. } t \in (0,T)\,, \end{aligned} \end{aligned}$$

by [10, Lemma 3.3], we conclude that

$$\begin{aligned} |\left( y_\epsilon (t) - M -\alpha (t)\right) ^+|_1 =0\,, \end{aligned}$$

if \({\tilde{F}}_\epsilon \le \alpha '\) a.e. in \((0,T)\times {\mathcal {O}}\). Moreover, for a suitable \(\alpha \), it also holds

$$\begin{aligned} y_\epsilon \le M+\alpha (t)\,,\quad \text{ a.e. }\, \text{ in } (0,T)\times {\mathcal {O}}\,, \end{aligned}$$

and

$$\begin{aligned} y_\epsilon \ge -M-\alpha (t)\,,\quad \text{ a.e. }\, \text{ in } (0,T)\times {\mathcal {O}}\,, \end{aligned}$$

and inequality (12) follows.

Using the approximating scheme described in [10, Lemma 3.3], we obtain (12) without requiring condition (13), and the claim follows. \(\square \)

Lemma 2.4

Let \(x \in D(A)\); then there exists an increasing function \(C:[0,\infty ) \rightarrow (0,\infty )\) such that

$$\begin{aligned} \sup _{t \in [0,T]} |y_\epsilon (t)|^2_2 + \int _0^T \int _{{\mathcal {O}}} |\nabla \gamma \left( e^{W_\epsilon }y_\epsilon \right) |^2 d\xi ds \le C\left( 1+|x|_\infty \right) \,,\quad \forall \epsilon \in (0,1]\,. \end{aligned}$$

Proof

In what follows we will use the following identities

$$\begin{aligned} \begin{aligned} \int _{{\mathcal {O}}} j(y_\epsilon (t)) d\xi =&\int _0^t {}_{H^{-1}} \left\langle \frac{d y_\epsilon }{ds}(s),\gamma (y_\epsilon (s))\right\rangle _{H^1_0} ds + \int _{{\mathcal {O}}} j(x) d\xi \,,\\ {}_{H^{-1}} \left\langle \varDelta \gamma (e^{W_\epsilon }y_\epsilon ), e^{-W_\epsilon }\gamma \left( y_\epsilon \right) \right\rangle _{H^1_0} =&- \int _{{\mathcal {O}}} \nabla \gamma \left( e^{W_\epsilon }y_\epsilon \right) \cdot \nabla \left( e^{-W_\epsilon }\gamma \left( y_\epsilon \right) \right) d\xi \,, \end{aligned} \end{aligned}$$

with \(j(r) = \int _0^r \gamma (s)ds\), \(r \in {\mathbb {R}}\).

Thus, multiplying Eq. (6) by \(\gamma (y_\epsilon )\) and integrating over \((0,t) \times {\mathcal {O}}\) we obtain

$$\begin{aligned} \begin{aligned}&\int _{{\mathcal {O}}} j(y_\epsilon (t)) d\xi + \int _0^t \int _{{\mathcal {O}}} \left[ \left( \nabla \gamma \left( e^{W_\epsilon }y_\epsilon \right) + \epsilon \nabla \left( e^{W_\epsilon }y_\epsilon \right) \right) \cdot \nabla \left( \gamma \left( y_\epsilon \right) e^{-W_\epsilon }\right) \right] d\xi ds \\&\qquad \qquad \qquad \le \int _{{\mathcal {O}}} j(x) d\xi - \int _0^t \int _{{\mathcal {O}}} \left( e^{-W_\epsilon } G_\epsilon \left( e^{W_\epsilon } y_\epsilon \right) + f y_\epsilon - e^{-W_\epsilon } F\right) \gamma \left( y_\epsilon \right) d\xi ds\,. \end{aligned} \end{aligned}$$
(14)

Concerning the last integral on the right hand side of Eq. (14), using Hypothesis 1 (i) on \(\gamma \) we obtain

$$\begin{aligned} \begin{aligned}&\int _0^t \int _{{\mathcal {O}}} \left( e^{-W_\epsilon } G_\epsilon (e^{W_\epsilon } y_\epsilon ) + f y_\epsilon \right) \gamma \left( y_\epsilon \right) d\xi ds \\&\qquad \le C \int _0^t \int _{{\mathcal {O}}} \left( e^{-W_\epsilon } G_\epsilon \left( e^{W_\epsilon } y_\epsilon \right) + f y_\epsilon \right) y_\epsilon d\xi ds \,. \end{aligned} \end{aligned}$$
(15)

Using estimate (15), recalling that \(G_\epsilon \) is the Yosida approximation of G, and using the monotonicity of \(\gamma \) and \(G_\epsilon \), it follows that

$$\begin{aligned} \begin{aligned}&\int _0^t \int _{{\mathcal {O}}} \left( e^{-W_\epsilon } G_\epsilon \left( e^{W_\epsilon } y_\epsilon \right) + f y_\epsilon \right) \gamma \left( y_\epsilon \right) d\xi ds \\&\quad \ge C\int _0^t \int _{{\mathcal {O}}} e^{-W_\epsilon } G_\epsilon \left( e^{W_\epsilon } y_\epsilon \right) y_\epsilon d\xi ds \\&\quad \ge -C\int _0^t \int _{{\mathcal {O}}} |f||y_\epsilon |^2 d\xi ds \,. \end{aligned} \end{aligned}$$
(16)

From the boundedness of F and the assumptions on \(\gamma \) and G in Hypothesis 1 \((i)-(ii)\), we obtain

$$\begin{aligned} \int _0^t \int _{{\mathcal {O}}} F \gamma \left( y_\epsilon \right) d\xi ds \le C (1+|x|^2_\infty ) \,, \end{aligned}$$
(17)

for a positive constant C independent of \(\epsilon \).

The other terms in Eq. (14) can be studied as done in [10, Lemma 3.3], so that the claim follows by Lemma 2.3. \(\square \)

Lemma 2.5

There is a unique solution to Eq. (5) with

$$\begin{aligned} \begin{aligned}&y \in W^{1,2}\left( [0,T];H^{-1}\right) \cap L^\infty \left( (0,T) \times {\mathcal {O}}\right) \,,\\&\quad \gamma \left( e^W y\right) \in L^2\left( 0,T;H^1_0\right) \,. \end{aligned} \end{aligned}$$
(18)

Moreover, the process y is \(\left( {\mathcal {F}}_t\right) _{t \ge 0}\)-adapted.

Proof

Let us first prove uniqueness. Let \(y_1\) and \(y_2\) be two solutions to Eq. (5), and let \({\bar{y}} := y_1 - y_2\). Then it holds

$$\begin{aligned} \begin{aligned}&\frac{\partial }{\partial t} {\bar{y}} + e^{-W} \left( G\left( e^W y_1\right) -G\left( e^W y_2\right) \right) + f {\bar{y}} \\&\qquad - e^{-W} \varDelta \left( \gamma \left( e^W y_1\right) - \gamma \left( e^W y_2\right) \right) + \mu {\bar{y}} = 0\, \quad \text{ in } \quad (0,T)\times {\mathcal {O}}\\&\quad {\bar{y}}(0,\xi ) = 0\,,\quad \xi \in {\mathcal {O}}\,.\\ \end{aligned} \end{aligned}$$
(19)

We can rewrite Eq. (19) as

$$\begin{aligned} \begin{aligned}&\frac{\partial }{\partial t} {\bar{y}} + (-\varDelta )({\bar{y}}\eta )= - e^{-W} \left( G\left( e^W y_1\right) -G\left( e^W y_2\right) \right) - e^{W} \varDelta \left( e^{-W}\right) {\bar{y}}\eta \\&\quad - \, f {\bar{y}} - 2 \nabla \left( e^{-W}\right) \cdot \nabla \left( e^W {\bar{y}}\eta \right) - \mu {\bar{y}} \,, \end{aligned} \end{aligned}$$
(20)

where we have denoted for short

$$\begin{aligned} \eta := {\left\{ \begin{array}{ll} \frac{\gamma (e^W y_1) - \gamma \left( e^W y_2\right) }{e^W {\bar{y}}} &{} \text{ on } \left\{ (t,\xi ) \,:\, {\bar{y}}(t,\xi ) \not = 0\right\} \,,\\ 0 &{} \text{ on } \left\{ (t,\xi ) \,:\, {\bar{y}}(t,\xi ) = 0\right\} \,. \end{array}\right. } \end{aligned}$$

Multiplying Eq. (20) by \((-\varDelta )^{-1}{\bar{y}}\), we obtain

$$\begin{aligned} \begin{aligned}&\frac{1}{2}|{\bar{y}}|_{-1}^2 + \int _0^t \int _{{\mathcal {O}}} \eta {\bar{y}}^2 ds d\xi \\&\quad = \int _0^t \int _{{\mathcal {O}}} e^{-W} \left( G\left( e^W y_1\right) -G\left( e^W y_2\right) \right) (-\varDelta )^{-1}{\bar{y}} ds d\xi \\&\qquad - \int _0^t \int _{{\mathcal {O}}} e^W \varDelta \left( e^{-W}\right) {\bar{y}}\eta (-\varDelta )^{-1}{\bar{y}} ds d\xi \\&\qquad -2 \int _0^t \int _{{\mathcal {O}}} \nabla \left( e^{-W}\right) \cdot \nabla \left( e^W {\bar{y}}\eta \right) (-\varDelta )^{-1}{\bar{y}} ds d\xi \\&\qquad -\int _0^t \int _{{\mathcal {O}}} f {\bar{y}}(-\varDelta )^{-1}{\bar{y}} ds d\xi - \int _0^t \int _{{\mathcal {O}}} \mu {\bar{y}}(-\varDelta )^{-1}{\bar{y}} ds d\xi . \end{aligned} \end{aligned}$$
(21)

Concerning the first integral on the right hand side of Eq. (21), notice that, for some \(\alpha = \alpha (t,\xi ) \in [0,1]\), it holds

$$\begin{aligned} G\left( e^W y_1\right) -G\left( e^W y_2\right) = G'\left( \alpha e^W y_1 + (1-\alpha ) e^W y_2\right) e^W {\bar{y}}\,. \end{aligned}$$

Moreover, since G is locally Lipschitz and \(y_1\), \(y_2 \in L^\infty ((0,T)\times {\mathcal {O}})\), we have

$$\begin{aligned} \begin{aligned} \left| \int _{{\mathcal {O}}} e^{-W} \left( G(e^W y_1)-G(e^W y_2)\right) (-\varDelta )^{-1}{\bar{y}} d\xi \right| \le C |{\bar{y}}|_2 |{\bar{y}}|_{-1}\,, \end{aligned} \end{aligned}$$

whereas the other terms can be treated as in [10, Theorem 2.2]. Therefore, we have

$$\begin{aligned} \frac{d}{dt}|{\bar{y}}|_{-1}^2 \le C |{\bar{y}}|_{-1}^2 \,, \, \text{ a.e. }\, t >0\,, \end{aligned}$$

from which it follows that \({\bar{y}} = 0\), and, by Lemma 2.4, it holds

$$\begin{aligned} |y(t)|_\infty + \int _0^t \int _{{\mathcal {O}}} |\nabla \gamma \left( y(s)\right) |^2 d\xi ds \le C\left( 1+ |x|_\infty \right) \,, \end{aligned}$$

so that, see [10, Theorem 2.2], we further have

$$\begin{aligned} \begin{aligned} y \in W^{1,2}\left( [0,T];H^{-1}\right) \cap L^\infty \left( (0,T) \times {\mathcal {O}}\right) \,. \end{aligned} \end{aligned}$$

As regards existence, by Lemmas 2.3 and 2.4, we have that \((\gamma (e^{W_\epsilon }y_\epsilon ))\) is bounded in \(L^2(0,T;H^1_0)\), \((y_\epsilon )\) is bounded in \(L^\infty (0,T;L^2) \cap L^\infty ((0,T) \times {\mathcal {O}}) \cap L^2(0,T;H^1_0)\), and \(\left( \frac{d y_\epsilon }{dt}\right) \) is bounded in \(L^2(0,T;H^{-1})\). Thus, by the Aubin compactness theorem, \((y_\epsilon )\) is compact in \(L^2(0,T;L^2({\mathcal {O}}))\). It follows that, for fixed \(\omega \in \varOmega \), along a subsequence, which we still denote by \(\{\epsilon \} \rightarrow 0\) for the sake of clarity, we have

$$\begin{aligned} \begin{aligned} y_\epsilon \rightarrow y&\quad \text{ strongly }\, \text{ in } \, L^2((0,T);L^2)\,,\\&\quad \text{ weak-star }\, \text{ in } \, L^\infty ((0,T)\times {\mathcal {O}})\,,\\&\quad \text{ weakly }\, \text{ in } \, L^2((0,T);H^1_0)\,,\\ \gamma \left( e^{W_\epsilon }y_\epsilon \right) \rightarrow \eta&\quad \text{ weakly }\, \text{ in } \, L^2((0,T);H^1_0)\,,\\ \frac{d y_\epsilon }{dt} \rightarrow \frac{d y}{dt}&\quad \text{ weakly }\, \text{ in } \, L^2((0,T);H^{-1})\,,\\ W_\epsilon \rightarrow W&\quad \text{ in } \, C([0,T]\times \overline{{\mathcal {O}}})\,.\\ \end{aligned} \end{aligned}$$
(22)

Since the map \(z \mapsto \gamma (z)\) is maximal monotone, by (22) we have that \(\eta = \gamma (e^Wy)\).

Then, since it holds

$$\begin{aligned} |\left( 1+\epsilon G\right) ^{-1}\left( e^{W_\epsilon } y_\epsilon \right) - e^{W_\epsilon } y_\epsilon |\le \epsilon |G_\epsilon \left( e^{W_\epsilon } y_\epsilon \right) | \le C \epsilon \quad \text{ a.e. }\, \text{ in }\quad (0,T) \times {\mathcal {O}}\,, \end{aligned}$$

and

$$\begin{aligned} \left( 1+\epsilon G\right) ^{-1}\left( e^{W_\epsilon } y_\epsilon \right) \rightarrow y \quad \text{ strongly }\, \text{ in } \, L^2\left( (0,T);L^2\right) \quad \text{ and } \quad \text{ a.e. }\, \text{ in }\quad (0,T) \times {\mathcal {O}}\,, \end{aligned}$$

then, for \(\epsilon \rightarrow 0\), we get

$$\begin{aligned} \begin{aligned} G_\epsilon \left( e^{W_\epsilon } y_\epsilon \right) \rightarrow \zeta \quad \text{ weakly }\, \text{ in } \, L^2\left( (0,T);H^1_0 \right) \quad \text{ and } \quad \text{ a.e. }\, \text{ in }\quad (0,T) \times {\mathcal {O}}\,. \end{aligned} \end{aligned}$$
(23)

Thus, again from the fact that \(G:{\mathbb {R}}\rightarrow {\mathbb {R}}\) is maximal monotone, hence closed, we have that \(\zeta = G(e^W y)\).

Therefore, by letting \(\epsilon \rightarrow 0\), from Eq. (6) we obtain

$$\begin{aligned} \begin{aligned}&\frac{dy}{dt} + e^{-W} G(e^Wy) + fy - e^{-W} \varDelta \gamma (e^Wy) + \mu y = e^{-W} F\,,\quad \text{ in } \, (0,T) \times {\mathcal {O}}\,,\\&\qquad \qquad \qquad y(0)=x\,. \end{aligned} \end{aligned}$$

Then, by the uniqueness result already proved, the limit y does not depend on the chosen subsequence; since each \(y_\epsilon \) is \(\left( {\mathcal {F}}_t\right) \)-adapted, so is y, which ends the proof. \(\square \)

We can finally prove that there exists a unique strong solution X to Eq. (3), which satisfies

$$\begin{aligned} Xe^{-W} \in W^{1,2}([0,T];H^{-1})\,\quad {\mathbb {P}}-a.s. \end{aligned}$$

Proof of Theorem 2.1

Using [9, Lemma 8.1] we have the equivalence between the stochastic PDE (3) and the random PDE (5) via the rescaling transformation (4), so that existence and uniqueness of a solution X in the sense of Definition 1.1 follow from Lemma 2.5. \(\square \)

3 The Optimal Control Problem

In this section we will focus our attention on a controlled version of Eq. (1). We denote by \({\mathcal {X}}=L^2_{ad}\left( (0,T) \times {\mathcal {O}}\right) \) the space of all \(\left( {\mathcal {F}}_t\right) \)-adapted processes \(u : (0,T) \times {\mathcal {O}} \rightarrow {\mathbb {R}}\), and we consider the following optimal control problem

$$\begin{aligned} \text{ Minimize } \, {\mathbb {E}} \left[ \int _0^T \int _{{\mathcal {O}}} \left| v(t,\xi ) -v_1(\xi )\right| ^2 + \frac{\alpha }{2} |u(t,\xi )|^2 d\xi dt + \int _{{\mathcal {O}}} \left| v(T,\xi ) -v_2(\xi )\right| ^2 d\xi \right] \,, \end{aligned}$$
(P)

subject to \(u \in {\mathcal {U}}\) and

$$\begin{aligned} {\left\{ \begin{array}{ll} \partial _t v(t,\xi ) - \varDelta \gamma (v(t,\xi )) + I^{ion} (v(t,\xi )) + f(\xi ) v(t,\xi ) = u(t,\xi ) + v(t,\xi ) \partial _t W(t) \;, \text{ in }\, (0,T)\times {\mathcal {O}}\\ v(0,\xi ) =v_0(\xi )\,,\quad \xi \in {\mathcal {O}}\;, \\ v(t,\xi ) = 0 \;, \text {on} \; (0,T) \times \partial {\mathcal {O}} \;. \end{array}\right. } \end{aligned}$$
(24)

Here

$$\begin{aligned} \begin{aligned} {\mathcal {U}}&:= \left\{ u \in L^2_{ad}((0,T)\times {\mathcal {O}}\times \varOmega ) \right. :\\&\left. \quad \,|u(t,\xi ,\omega )|\le M\,\quad \text{ a.e. } \left( t,\xi ,\omega \right) \in (0,T)\times {\mathcal {O}}\times \varOmega \right\} \,, \end{aligned} \end{aligned}$$

\(M>0\) being a suitable constant, while \(v_1\), \(v_2 \in L^2({\mathcal {O}})\), \({\mathcal {F}}_0\)-measurable, and \(\alpha >0\) are given.

In what follows we are going to treat the problem (P) by a rescaling procedure which allows us to reduce it to a random optimal control problem. In the current section the following Hypothesis will be assumed to hold.

Hypothesis 2

  1. (i)

    \(G \in C^1({\mathbb {R}})\), \(G'\) is locally Lipschitz.

Theorem 3.1

Let Hypotheses 1 and 2 hold; then, for T sufficiently small, there exists at least one optimal pair \( \left( u^*,v^*\right) \) for problem (P).

Proof

As in Sect. 2, we will apply the rescaling transformation \(y := e^{-W}v\), so that the optimal control problem (P) reads

$$\begin{aligned} \begin{aligned}&\text{ Minimize } \, {\mathbb {E}} \left[ \int _0^T \int _{{\mathcal {O}}} \left| e^W y(t,\xi ) -v_1(\xi )\right| ^2 + \frac{\alpha }{2} |u(t,\xi )|^2 d\xi dt \right] \\&\qquad \qquad \qquad +\, {\mathbb {E}} \left[ \int _{{\mathcal {O}}} \left| e^W y(T,\xi ) -v_2(\xi )\right| ^2 d\xi \right] \,, \end{aligned} \end{aligned}$$
(P2)

subject to

$$\begin{aligned} \begin{aligned}&\partial _t y - e^{-W} \varDelta \gamma (e^W y) + e^{-W} G(e^W y)+fy+\mu y = e^{-W}u \;, \text{ in }\, (0,T)\times {\mathcal {O}}\,,\\&\quad y = 0 \,,\quad \text{ on } \quad (0,T) \times \partial {\mathcal {O}}\,. \end{aligned} \end{aligned}$$
(25)

Due to the cubic nonlinear term G, existence and uniqueness of an optimal control cannot be established by standard minimization arguments. To overcome this problem, Ekeland’s variational principle can be exploited to obtain a nearly optimal solution to the above control problem.

Applying Ekeland’s variational principle, see, e.g. [28] or also [12, 22], there exists a sequence \(\{u_\epsilon \}\subset {\mathcal {U}}\) such that, defining for short,

$$\begin{aligned} \begin{aligned} \varPsi (u) :=&{\mathbb {E}} \left[ \int _0^T \int _{{\mathcal {O}}} \left| e^W y(t,\xi ) -v_1(\xi )\right| ^2 + \frac{\alpha }{2} |u(t,\xi )|^2 d\xi dt\right] \\&+{\mathbb {E}} \left[ \int _{{\mathcal {O}}} \left| e^W y(T,\xi ) -v_2(\xi )\right| ^2 d\xi \right] \,, \end{aligned} \end{aligned}$$

it follows

$$\begin{aligned} \begin{aligned} \varPsi \left( u_\epsilon \right)&\le \inf \left\{ \varPsi (u) \, ; u \in {\mathcal {U}}\right\} +\epsilon \, ,\\ \varPsi \left( u_\epsilon \right)&\le \varPsi (u) + \sqrt{\epsilon }\left| u_\epsilon - u\right| _{{\mathcal {U}}} \, ,\quad \forall \, u \in {\mathcal {U}}\, , \end{aligned} \end{aligned}$$
(26)

or equivalently it holds

$$\begin{aligned} u_\epsilon = \arg \min _{u \in {\mathcal {U}}} \{ \varPsi (u) + \sqrt{\epsilon }\left| u_\epsilon - u\right| _{{\mathcal {U}}} \}\, . \end{aligned}$$
(27)
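
A finite-dimensional illustration may help here: the brute-force sketch below (toy cost and control set, all hypothetical) produces a point satisfying the Ekeland inequalities (26)-(27), namely an almost-minimizer of \(\varPsi \) that exactly minimizes the perturbed functional.

```python
import numpy as np

Psi = lambda u: (u ** 2 - 1.0) ** 2 + 0.3 * np.sin(5.0 * u)  # nonconvex toy cost
U = np.linspace(-2.0, 2.0, 4001)                             # compact control set
eps = 1e-3

inf_Psi = Psi(U).min()
u_eps = U[np.argmax(Psi(U) <= inf_Psi + eps)]                # any eps-minimizer

# Iterate: replace the current point by the minimizer of the perturbed cost;
# on a finite grid the strictly decreasing cost values force termination.
while True:
    u_next = U[np.argmin(Psi(U) + np.sqrt(eps) * np.abs(U - u_eps))]
    if u_next == u_eps:
        break
    u_eps = u_next

assert Psi(u_eps) <= inf_Psi + eps                           # (26), first inequality
assert np.all(Psi(u_eps) <= Psi(U) + np.sqrt(eps) * np.abs(U - u_eps) + 1e-12)
print(u_eps, Psi(u_eps) - inf_Psi)                           # (26)-(27) both hold
```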

In particular, the pair \((y_\epsilon ,u_\epsilon )\) solves

$$\begin{aligned} \begin{aligned}&\text{ Minimize } \, {\mathbb {E}} \left[ \int _0^T \int _{{\mathcal {O}}} \left| e^W y(t,\xi ) -v_1(\xi )\right| ^2 + \frac{\alpha }{2} |u(t,\xi )|^2 d\xi dt \right] \\&\quad +\, {\mathbb {E}} \left[ \int _{{\mathcal {O}}} \left| e^W y(T,\xi ) -v_2(\xi )\right| ^2 d\xi \right] + \sqrt{\epsilon }\left| u_\epsilon - u\right| _{{\mathcal {U}}} \,. \end{aligned} \end{aligned}$$
(28)

A variational argument akin to the one carried out in [12, 22] allows us to associate with the optimal control problem (26) the dual backward problem

$$\begin{aligned} {\left\{ \begin{array}{ll} \partial _t y_\epsilon - e^{-W} \varDelta \gamma \left( e^W y_\epsilon \right) + e^{-W} G\left( e^W y_\epsilon \right) + fy_\epsilon +\mu y_\epsilon = \frac{1}{\alpha } e^{-W}\left( p_\epsilon + \theta _\epsilon \right) \,\\ \partial _t p_\epsilon + e^{W} \gamma '\left( e^W y_\epsilon \right) \varDelta \left( e^{-W}p_\epsilon \right) + e^W G'\left( e^W y_\epsilon \right) p_\epsilon - f p_\epsilon - \mu p_\epsilon = 2\left( e^W y_\epsilon -v_1\right) \,\\ y_\epsilon (0) = y_0 \,,\quad p_\epsilon (T) = 2 (e^W y_\epsilon (T) - v_2)\,,\\ \end{array}\right. } \end{aligned}$$
(29)

where \(|\theta _\epsilon |_{L^2(\varOmega \times {\mathcal {O}}\times (0,T))}\le \sqrt{\epsilon }\). Indeed, by (26), it follows that \(\varPsi '(u_\epsilon ) = \theta _\epsilon \), yielding (29). Existence and uniqueness of a solution to Eq. (25) follow from Lemma 2.5, whereas standard arguments yield, using Hypothesis 2, that there exists a unique solution \(p_\epsilon \in L^\infty ((0,T) \times {\mathcal {O}}\times \varOmega )\), see [5].

By Eq. (29) we have \({\mathbb {P}}-a.s.\),

$$\begin{aligned} \begin{aligned}&\partial _t \left( y_\epsilon - y_\lambda \right) - e^{-W} \varDelta \left( \gamma (e^W y_\epsilon ) - \gamma (e^W y_\lambda )\right) \\&\qquad + e^{-W} \left( G\left( e^W y_\epsilon \right) - G\left( e^W y_\lambda \right) \right) +f \left( y_\epsilon - y_\lambda \right) +\mu \left( y_\epsilon - y_\lambda \right) \\&\quad = \frac{1}{\alpha } e^{-W}\left( \left( p_\epsilon -p_\lambda \right) + \left( \theta _\epsilon -\theta _\lambda \right) \right) \,.\\ \end{aligned} \end{aligned}$$
(30)

Therefore, multiplying Eq. (30) by \((y_\epsilon - y_\lambda )\) and integrating over \((0,t)\times {\mathcal {O}}\), we obtain, by virtue of Lemma 2.4,

$$\begin{aligned} \begin{aligned}&\frac{1}{2}|y_\epsilon (t) - y_\lambda (t)|_{2}^2 + \int _0^t \int _{{\mathcal {O}}} \left| \nabla \left( \gamma \left( y_\epsilon \right) - \gamma \left( y_\lambda \right) \right) \right| ^2 d\xi ds \\&\quad = - \int _0^t \int _{{\mathcal {O}}} e^{-W}\left( G\left( e^W y_\epsilon \right) -G\left( e^W y_\lambda \right) \right) (y_\epsilon - y_\lambda )d\xi ds \\&\qquad -\int _0^t \int _{{\mathcal {O}}} f \left( y_\epsilon - y_\lambda \right) \left( y_\epsilon - y_\lambda \right) d\xi ds \\&\qquad -\int _0^t \int _{{\mathcal {O}}} \mu \left( y_\epsilon - y_\lambda \right) \left( y_\epsilon - y_\lambda \right) d\xi ds\\&\qquad +\int _0^t \int _{{\mathcal {O}}} \frac{e^{-W}}{\alpha } \left( p_\epsilon -p_\lambda \right) \left( y_\epsilon - y_\lambda \right) d\xi ds \\&\qquad +\int _0^t \int _{{\mathcal {O}}} \frac{e^{-W}}{\alpha } \left( \theta _\epsilon -\theta _\lambda \right) \left( y_\epsilon - y_\lambda \right) d\xi ds \,. \end{aligned} \end{aligned}$$
(31)

The last two terms in Eq. (31) can be treated exploiting the Young inequality, yielding

$$\begin{aligned} \begin{aligned} \left| \int _{{\mathcal {O}}} \frac{e^{-W}}{\alpha } \left( p_\epsilon -p_\lambda \right) \left( y_\epsilon - y_\lambda \right) d\xi \right|&\le C |p_\epsilon -p_\lambda |_2 |y_\epsilon - y_\lambda |_{2} \\&\le C_1 |p_\epsilon -p_\lambda |_2^2 + C_2 |y_\epsilon - y_\lambda |_{2}^2\,,\\ \left| \int _{{\mathcal {O}}} \frac{e^{-W}}{\alpha } \left( \theta _\epsilon -\theta _\lambda \right) \left( y_\epsilon - y_\lambda \right) d\xi \right|&\le C |\theta _\epsilon -\theta _\lambda |_2 |y_\epsilon - y_\lambda |_{2} \\&\le C |y_\epsilon - y_\lambda |_{2}^2 + \epsilon + \lambda \,, \end{aligned} \end{aligned}$$
(32)

so as to obtain

$$\begin{aligned} \begin{aligned}&\frac{1}{2} |y_\epsilon (t) - y_\lambda (t)|_{2}^2 + \int _0^t \left| \nabla \left( \gamma \left( y_\epsilon \right) - \gamma (y_\lambda )\right) \right| _2^2 ds \\&\quad \le C_1 \int _0^t |y_\epsilon (s) -y_\lambda (s)|_{2}^2 ds + C_2 \int _0^t |p_\epsilon (s) -p_\lambda (s)|_{2}^2 ds + \epsilon + \lambda \,. \end{aligned} \end{aligned}$$
(33)

Applying the Gronwall lemma and taking the expectation, we obtain, using Hypothesis 1 (i) on \(\gamma \),

$$\begin{aligned} \begin{aligned} {\mathbb {E}} |y_\epsilon (t) - y_\lambda (t)|_{2}^2 + {\mathbb {E}} \int _0^t \left| \nabla \left( y_\epsilon - y_\lambda \right) \right| _2^2 ds\le C {\mathbb {E}} \int _0^t |p_\epsilon (s) -p_\lambda (s)|_{2}^2 ds + \epsilon + \lambda \,. \end{aligned} \end{aligned}$$
(34)

Regarding the second equation in (29), we obtain

$$\begin{aligned} \begin{aligned}&\partial _t \left( p_\epsilon - p_\lambda \right) + e^{W} \left( \gamma '\left( e^W y_\epsilon \right) \varDelta \left( e^{-W} p_\epsilon \right) - \gamma '\left( e^W y_\lambda \right) \varDelta \left( e^{-W}p_\lambda \right) \right) \\&\qquad \quad +\,e^W \left( G'\left( e^W y_\epsilon \right) p_\epsilon - G'\left( e^W y_\lambda \right) p_\lambda \right) -f \left( p_\epsilon - p_\lambda \right) -\mu \left( p_\epsilon - p_\lambda \right) \\&\qquad = 2\left( e^W \left( y_\epsilon -y_\lambda \right) \right) \,.\\ \end{aligned} \end{aligned}$$
(35)

Then, multiplying Eq. (35) by \((p_\epsilon - p_\lambda )\) and integrating over \((t,T)\times {\mathcal {O}}\), we obtain

$$\begin{aligned} \begin{aligned}&\frac{1}{2}|p_\epsilon (t) - p_\lambda (t)|_2^2 = |e^{W}\left( y_\epsilon (T) - y_\lambda (T)\right) |_2^2 \\&\quad +\int _t^T \int _{{\mathcal {O}}} e^{W} \left( \gamma '\left( e^W y_\epsilon \right) \varDelta \left( e^{-W}p_\epsilon \right) - \gamma '\left( e^W y_\lambda \right) \varDelta \left( e^{-W}p_\lambda \right) \right) \left( p_\epsilon (s) - p_\lambda (s)\right) d\xi ds \\&\qquad +\int _t^T\int _{{\mathcal {O}}} e^W \left( G'\left( e^W y_\epsilon \right) p_\epsilon (s) - G'\left( e^W y_\lambda \right) p_\lambda (s)\right) \left( p_\epsilon (s) - p_\lambda (s)\right) d\xi ds\\&\qquad -\int _t^T \int _{{\mathcal {O}}}(f+\mu ) (p_\epsilon (s) - p_\lambda (s))^2 d\xi ds\\&\qquad -\int _t^T \int _{{\mathcal {O}}} 2\left( e^W \left( y_\epsilon -y_\lambda \right) \right) \left( p_\epsilon (s) - p_\lambda (s)\right) d\xi ds \,. \end{aligned} \end{aligned}$$

Rearranging terms above, we further have

$$\begin{aligned}&\frac{1}{2}|p_\epsilon (t) - p_\lambda (t)|_2^2 = |e^{W}\left( y_\epsilon (T) - y_\lambda (T)\right) |_2^2 \nonumber \\&\quad +\int _t^T \int _{{\mathcal {O}}} e^{W} \gamma '\left( e^W y_\epsilon \right) \left( \varDelta \left( e^{-W}p_\epsilon \right) - \varDelta \left( e^{-W}p_\lambda \right) \right) \left( p_\epsilon (s) - p_\lambda (s)\right) d\xi ds \nonumber \\&\qquad +\int _t^T \int _{{\mathcal {O}}} e^{W} \varDelta \left( e^{-W}p_\lambda \right) \left( \gamma '\left( e^W y_\epsilon \right) - \gamma '\left( e^W y_\lambda \right) \right) \left( p_\epsilon (s) - p_\lambda (s)\right) d\xi ds \nonumber \\&\qquad +\int _t^T\int _{{\mathcal {O}}} e^W G'\left( e^W y_\epsilon \right) \left( p_\epsilon (s) - p_\lambda (s)\right) ^2 d\xi ds\nonumber \\&\qquad +\int _t^T\int _{{\mathcal {O}}} e^W p_\lambda (s) \left( G'\left( e^W y_\epsilon \right) - G'\left( e^W y_\lambda \right) \right) \left( p_\epsilon (s) - p_\lambda (s)\right) d\xi ds\nonumber \\&\qquad -\int _t^T \int _{{\mathcal {O}}}(f+\mu ) \left( p_\epsilon (s) - p_\lambda (s)\right) ^2 d\xi ds\nonumber \\&\qquad -\int _t^T \int _{{\mathcal {O}}} 2\left( e^W \left( y_\epsilon -y_\lambda \right) \right) \left( p_\epsilon (s) - p_\lambda (s)\right) d\xi ds \,. \end{aligned}$$
(36)

Taking into account that \(u \in {\mathcal {U}}\), it follows exploiting Lemma 2.3 and using the Young inequality,

$$\begin{aligned} \begin{aligned}&|p_\epsilon (t) - p_\lambda (t)|_2^2 \le |y_\epsilon (T) - y_\lambda (T)|_2^2 \\&\quad +C \int _t^T |p_\epsilon (s) - p_\lambda (s)|_2^2 ds + C \int _t^T |p_\lambda (s)|_2 |p_\epsilon (s) - p_\lambda (s)|_2 ds \\&\qquad +C \int _t^T |y_\epsilon (s) - y_\lambda (s)|_2 |p_\lambda (s)|_2 |p_\epsilon (s) - p_\lambda (s)|_2 ds \\&\quad \le C\left( \int _t^T |p_\epsilon (s) - p_\lambda (s)|_2^2 ds + \int _t^T |y_\epsilon (s) - y_\lambda (s)|_2^2 ds\right) \,, \end{aligned} \end{aligned}$$
(37)

where we have used C to denote possibly different constants, so as to simplify the notation.

Taking the expectation in Eq. (37) and combining it with Eq. (34), we thus have

$$\begin{aligned} \begin{aligned}&{\mathbb {E}} |y_\epsilon (t) - y_\lambda (t)|_{2}^2 + {\mathbb {E}} |p_\epsilon (t) - p_\lambda (t)|_2^2 \\&\quad \le C \left( {\mathbb {E}} \int _0^t |p_\epsilon (s) -p_\lambda (s)|_{2}^2 ds + \epsilon + \lambda \right) \\&\qquad +C\left( {\mathbb {E}} \int _t^T |p_\epsilon (s) - p_\lambda (s)|_2^2 ds + {\mathbb {E}} \int _t^T |y_\epsilon (s) - y_\lambda (s)|_2^2 ds\right) \,, \end{aligned} \end{aligned}$$
(38)

so that, if T is small enough, we can infer that

$$\begin{aligned} \begin{aligned} {\mathbb {E}} |y_\epsilon (t) - y_\lambda (t)|_{2}^2 + {\mathbb {E}} |p_\epsilon (t) - p_\lambda (t)|_2^2 \le C \left( \epsilon + \lambda \right) \,, \end{aligned} \end{aligned}$$
(39)

implying that \((y_\epsilon ,p_\epsilon )\) is a Cauchy sequence. Therefore, along a subsequence, still denoted by \(\{\epsilon \} \rightarrow 0\) for the sake of clarity, we have

$$\begin{aligned} \begin{aligned}&y_\epsilon \rightarrow y \quad \text{ weakly }\, \text{ in } \, L^2((0,T);H^1_0)\,,\\&p_\epsilon \rightarrow p \quad \text{ weakly }\, \text{ in } \, L^2((0,T)\times {\mathcal {O}})\,,\\&u_\epsilon := \frac{1}{\alpha } e^{-W}(p_\epsilon + \theta _\epsilon ) \rightarrow u^* \quad \text{ weakly }\, \text{ in } \, L^2((0,T)\times {\mathcal {O}}\times \varOmega )\,. \end{aligned} \end{aligned}$$
(40)

Further, by arguments analogous to the ones used in the proof of Lemma 2.5, it follows from the fact that G is maximal monotone that, as \(\epsilon \rightarrow 0\),

$$\begin{aligned} G \left( e^{W_\epsilon } y_\epsilon \right) \rightarrow G(e^W y) \quad \text{ weakly }\, \text{ in } \, L^2\left( (0,T);H^1_0\right) \,. \end{aligned}$$
(41)

Letting then \(\epsilon \rightarrow 0\) in the first equation in (29), we have, denoting by \(y^*\) the solution corresponding to the optimal control \(u^*\),

$$\begin{aligned} \partial _t y^* - e^{-W} \varDelta \gamma \left( e^W y^*\right) + e^{-W} G\left( e^W y^*\right) + f y^* +\mu y^* = e^{-W} u^*\,, \end{aligned}$$

hence, since \(\varPsi \) is lower–semicontinuous, previous computations give us:

$$\begin{aligned} \varPsi (u^*) = \inf _{u \in {\mathcal {U}}} \varPsi (u)\,, \end{aligned}$$

and the claimed existence result follows. \(\square \)

Theorem 3.2

(Necessary condition of optimality) Let \((v^*,u^*)\) be an optimal pair for problem (P); then, if \(\alpha > 0\), it holds

$$\begin{aligned} u^*(t,\xi ) = \frac{1}{\alpha } P_U\left( -e^{-W}p(t,\xi )\right) \,,\quad \text{ a.e. } \text{ on } \quad (0,T) \times {\mathcal {O}}\times \varOmega \,, \end{aligned}$$

where p is the solution to the dual backward Eq. (45) and

$$\begin{aligned} P_U(v) = {\left\{ \begin{array}{ll} M &{} v \ge M\,,\\ v &{} |v| \le M\,,\\ -M &{} v \le -M\,.\\ \end{array}\right. } \end{aligned}$$
(42)
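
Operationally, \(P_U\) is a pointwise clipping; the sketch below evaluates the feedback law of Theorem 3.2 on sample values of the dual state (all numbers hypothetical, and \(W \equiv 0\) purely for illustration).

```python
import numpy as np

M, alpha = 1.0, 0.5
P_U = lambda v: np.clip(v, -M, M)           # projection (42) onto [-M, M]

p = np.array([-3.0, -0.4, 0.0, 0.7, 2.5])   # sample values of the dual state
W = np.zeros_like(p)
u_star = (1.0 / alpha) * P_U(-np.exp(-W) * p)
print(u_star)                               # saturated at +-M, then scaled by 1/alpha
```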

Remark 3.1

We would like to underline that, in the literature on stochastic control problems, the first order conditions of optimality (the Pontryagin maximum principle) are expressed in terms of a dual backward stochastic equation, see, e.g., [12, 17]. Here, instead, optimality conditions are given in terms of a random backward dual equation, which simplifies the setting and also gives more insight into the derived optimal controller.

Proof

We prove the result by exploiting the rescaling transformation \(y := e^{-W}X\), hence proving necessary conditions for problem (P2).

Let \((y^*,u^*)\) be an optimal pair for problem (P2). For any \(u \in {\mathcal {U}}\), define \(u^\lambda := u^* + \lambda {\bar{u}} = u^* + \lambda (u - u^*)\), \(\lambda \ge 0\); by the optimality of \(u^*\) it must hold

$$\begin{aligned} \frac{1}{\lambda } \left( \varPsi (u^\lambda ) - \varPsi (u^*)\right) \ge 0\,. \end{aligned}$$

By the Gâteaux differentiability of \(\varPsi \) it follows, taking the limit as \(\lambda \rightarrow 0\),

$$\begin{aligned} \begin{aligned}&{\mathbb {E}}\left[ \int _0^T \int _{{\mathcal {O}}} \left( e^W y^*(t,\xi ) -v_1(\xi )\right) z(t) + \alpha u^* {\bar{u}} d\xi dt \right] \\&\quad +\, {\mathbb {E}}\left[ \int _{{\mathcal {O}}} \left( e^W y^*(T,\xi ) -v_2(\xi )\right) z(T) d\xi \right] \ge 0\,, \end{aligned} \end{aligned}$$
(43)

where z is the solution to the system in variations defined as

$$\begin{aligned} {\left\{ \begin{array}{ll} &{}{\dot{z}}(t) - \varDelta \left( \gamma '(e^W y^{*}) z(t)\right) + G'(e^W y^{*})z(t) + \mu z(t) = e^{-W} {\bar{u}} \,,\\ &{}\gamma '(e^W y^{*}) z(t) \in H^1_0({\mathcal {O}})\,,\, t \in (0,T)\,,\\ &{} z(0)=0\,. \end{array}\right. } \end{aligned}$$
(44)

Therefore, introducing the backward dual system

$$\begin{aligned} \begin{aligned} {\dot{p}}(t)&= - \varDelta \gamma ' \left( e^W y^{*}\right) p(t) - G'\left( e^W y^{*}\right) p(t) + \mu p(t) + 2\left( e^W y^{*} - v_1\right) \,,\\ p(T)&= 2\left( y^*(T)-v_2\right) \,, \end{aligned} \end{aligned}$$
(45)

and exploiting Eqs. (44) and (45) together with Eq. (43), we have

$$\begin{aligned} \begin{aligned} {\mathbb {E}}\left[ \int _0^T \int _{{\mathcal {O}}} \left\langle e^{-W}p(t) + \alpha u^*, {\bar{u}} \right\rangle dt \right] \ge 0\,, \end{aligned} \end{aligned}$$
(46)

which gives

$$\begin{aligned} u^*(t,\xi ) = \frac{1}{\alpha } P_U(-e^{-W}p(t,\xi ))\,,\quad \text{ a.e. } \text{ on } \quad (0,T) \times {\mathcal {O}}\times \varOmega \,, \end{aligned}$$

where \(P_U\) is the projection operator defined in (42). \(\square \)

Theorem 3.3

(The bang–bang principle) Let \((v^*,u^*)\) be an optimal pair for problem (P) and let \(\alpha = 0\); then it holds

$$\begin{aligned} u^*= {\left\{ \begin{array}{ll} -M &{} \text { if } p > 0\,,\\ \in [-M,M] &{} \text { if } p = 0\,,\\ M &{} \text { if } p <0 \,, \end{array}\right. } \end{aligned}$$
(47)

where p is the solution to the dual backward Eq. (45).

Proof

Proceeding as in Theorem 3.2, we obtain the equivalent of Eq. (46), namely

$$\begin{aligned} \begin{aligned} {\mathbb {E}}\left[ \int _0^T \int _{{\mathcal {O}}} \left\langle e^{-W}p (t) , {\bar{u}} \right\rangle dt\right] \ge 0\,, \end{aligned} \end{aligned}$$
(48)

which yields Eq. (47), and the claim follows. \(\square \)
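
In the same illustrative vein, the bang–bang law (47) is a sign switch driven by the dual state; the sample values below are hypothetical.

```python
import numpy as np

M = 1.0
p = np.array([-0.8, -0.1, 0.0, 0.2, 1.5])   # sample values of the dual state
u_star = np.select([p > 0, p < 0],
                   [-M * np.ones_like(p), M * np.ones_like(p)],
                   default=0.0)   # on {p = 0} any value in [-M, M] is admissible
print(u_star)
```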

Remark 3.2

By (47) it follows that if \(| v^* -v_1 | >0 \) a.e. on \((0,T) \times {\mathcal {O}}\times \varOmega \), then the optimal controller \(u^*\) is a bang–bang controller, namely \(| u^* | =M\) a.e. on \((0,T) \times {\mathcal {O}}\times \varOmega \).