1 Introduction

Let us consider the nonlinear stochastic partial differential equation

$$\begin{aligned} du(t) - {\text {div}}\gamma (\nabla u(t))\,dt = B(t, u(t))\,dW(t), \quad u(0)=u_0, \end{aligned}$$
(1.1)

on \(L^2(D)\), where \(D \subset \mathbb {R}^n\) is a bounded domain with smooth boundary. Here \(\gamma \) is the gradient of a continuously differentiable convex function on \(\mathbb {R}^n\) growing faster than linearly at infinity, the divergence is interpreted in the usual variational sense, W is a cylindrical Wiener process, and B is a map with values in the space of Hilbert–Schmidt operators satisfying suitable Lipschitz continuity hypotheses. Precise assumptions on the data of the problem are given in Sect. 2 below.

Our main result is the well-posedness of (1.1), in the strong probabilistic sense, without any polynomial growth condition on \(\gamma \) or any boundedness assumption on the noise (see Theorem 2.2 below). The lack of growth and coercivity assumptions on \(\gamma \) makes it impossible to apply the variational approach by Pardoux and Krylov–Rozovskiĭ (see [7, 12]), which is the only known general technique to solve nonlinear stochastic PDEs without linear terms in the drift such as (1.1), with the possible exception of viscosity solutions, a theory of which, however, does not seem to be available for such equations. On the other hand, we recall that, if \(\gamma \) is coercive and has polynomial growth, the results in op. cit. provide a fully satisfactory well-posedness result for (1.1).

The available literature dealing with stochastic equations in divergence form such as (1.1) is very limited and, to the best of our knowledge, entirely focused on the case where \(\gamma \) satisfies the above-mentioned coercivity and growth assumptions: see, e.g., [8] and the bibliography of [9] for results on the p-Laplace equation, which corresponds to the case \(\gamma (x)=|x|^{p-1}x\), and [13] on stochastic equations in divergence form with doubly nonlinear drift. The main novelty of this paper is thus to provide a satisfactory well-posedness result in the strong sense for such divergence-form equations under neither coercivity nor growth assumptions on \(\gamma \). On the other hand, it is worth recalling that well-posedness results are available for other classes of monotone SPDEs with nonlinearities satisfying no coercivity and growth conditions, most notably the stochastic porous media equation: see, e.g., [3]. However, the structure of divergence-form equations such as (1.1) is radically different. Indeed, as is well-known, the porous media operator is quasilinear, while the divergence-type operator in (1.1) is fully nonlinear. Moreover, the monotonicity properties (hence the dynamics associated to the solutions) are different: the porous media operator is monotone in \(H^{-1}\), whereas the divergence-form operator is monotone in \(L^2\).

As is often the case in the treatment of evolution equations of monotone type, the first step consists in the regularization of (1.1), replacing \(\gamma \) with its Yosida approximation (a monotone Lipschitz-continuous function), thus obtaining a family of equations for which well-posedness is known to hold (in our case, we also need to add a “small” elliptic term in the drift as well as to smooth the diffusion coefficient B). In a second step, one proves that the solutions to the regularized equations are compact in suitable topologies, so that, by passage to the limit in the regularization parameters (roughly speaking), a process can be constructed that, in a final step, is shown to actually be the unique solution to (1.1) and to depend continuously on the initial datum. It is well known that the last two steps are the more challenging ones, and our problem is no exception.

The approach we follow combines elements of the variational method and ad hoc arguments, most notably a priori estimates on the solutions to regularized equations, weak compactness techniques, and a generalized version of Itô’s formula for the square of the norm under minimal integrability assumptions. A crucial role is played by a mix of pathwise and “averaged” a priori estimates. Even though the approach is reminiscent of that in [11], the problem we consider here is of a completely different nature, and, correspondingly, new ideas are needed. In particular, the absence of a linear term in the drift precludes the possibility of applying a wealth of techniques available for semi-linear problems. For instance, the strong pathwise compactness criteria used in op. cit. are no longer available, so that we have to rely on weak compactness arguments only. In this way one can construct a limit process, but its identification as a solution presents, as is to be expected, major new issues with respect to the case where stronger compactness is available. Moreover, a rather subtle measurability problem arises from the fact that the divergence is not injective, which is the reason for assuming \(\gamma \) to be a continuous monotone map, and not just a maximal monotone graph on \(\mathbb {R}^n \times \mathbb {R}^n\). A (less regular) solution to the more general problem where \(\gamma \) satisfies only the latter condition will appear elsewhere. We remark that the results obtained here hold under hypotheses that are as general as those of the deterministic theory, except for the continuity assumption on \(\gamma \) (see, e.g., [2, pp. 207–ff.]).

2 Main result

Given a positive real number T, let \((\Omega ,\mathscr {F},(\mathscr {F}_t)_{t\in [0,T]},\mathbb {P})\) be a filtered probability space, fixed throughout, satisfying the so-called “usual conditions”. We shall denote a cylindrical Wiener process on a separable Hilbert space H by W.

For any two Hilbert spaces U and V, the space of Hilbert–Schmidt operators from U to V will be denoted by \(\mathscr {L}^2(U,V)\). Let D be a smooth bounded domain of \(\mathbb {R}^n\), and assume that a map

$$\begin{aligned} B: \Omega \times [0,T] \times L^2(D) \longrightarrow \mathscr {L}^2(H,L^2(D)) \end{aligned}$$

is given such that, for a constant \(C>0\),

$$\begin{aligned} ||B(\omega ,t,x) - B(\omega ,t,y) ||_{\mathscr {L}^2(H,L^2(D))} \le C ||x - y ||_{L^2(D)} \end{aligned}$$

for all \(\omega \in \Omega \), \(t \in [0,T]\), \(x, y \in L^2(D)\). To avoid trivial situations, we also assume that, for an \(x_0 \in L^2(D)\), \(||B(\omega ,t,x_0) ||_{\mathscr {L}^2(H,L^2(D))} \le C\) for all \(\omega \) and t. This implies that B grows at most linearly in x, uniformly over \(\omega \) and t. Furthermore, the map \((\omega ,t) \mapsto B(\omega ,t,x)h\) is assumed to be measurable and adapted for all \(x \in L^2(D)\) and \(h \in H\).
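The linear growth assertion can be spelled out for convenience: by the triangle inequality, the Lipschitz estimate combined with the bound at \(x_0\) yields

$$\begin{aligned} ||B(\omega ,t,x) ||_{\mathscr {L}^2(H,L^2(D))} \le ||B(\omega ,t,x_0) ||_{\mathscr {L}^2(H,L^2(D))} + C ||x - x_0 || \le C\left( 1 + ||x_0 ||\right) + C ||x || \end{aligned}$$

for all \(\omega \), t and \(x \in L^2(D)\).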

We assume that \(\gamma \) is the subdifferential of a continuously differentiable convex function \(k:\mathbb {R}^n \rightarrow \mathbb {R}_+\) such that \(k(0)=0\),

$$\begin{aligned} \lim _{|x| \rightarrow \infty } \frac{k(x)}{|x|} = + \infty \end{aligned}$$

(i.e. k is superlinear at infinity), and

$$\begin{aligned} \limsup _{|x|\rightarrow \infty } \frac{k(-x)}{k(x)} < \infty . \end{aligned}$$
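This condition excludes potentials that are too asymmetric at infinity. For instance (an illustration of ours, not used in the sequel): every even k satisfies it with limit 1, whereas

$$\begin{aligned} k(x) = e^{x_1} - x_1 - 1 + |x|^2 \end{aligned}$$

is convex, continuously differentiable, vanishes at the origin, and is superlinear at infinity, yet \(k(-x)/k(x) \rightarrow +\infty \) as \(t \rightarrow +\infty \) along \(x = -te_1\), so it does not satisfy the condition.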

Then \(\gamma :\mathbb {R}^n \rightarrow \mathbb {R}^n\) is a continuous maximal monotone map, i.e.

$$\begin{aligned} \left( \gamma (x)-\gamma (y)\right) \cdot (x-y) \ge 0 \quad \forall x,y \in \mathbb {R}^n \end{aligned}$$

(the centered dot stands for the Euclidean scalar product in \(\mathbb {R}^n\)), and (the graph of) \(\gamma \) is maximal with respect to inclusion among (graphs of) monotone operators. Moreover, the convex conjugate function \(k^*:\mathbb {R}^n \rightarrow \mathbb {R}_+\) of k, defined as

$$\begin{aligned} k^*(y) = \sup _{r\in \mathbb {R}^n} \left( y \cdot r - k(r) \right) , \end{aligned}$$

is itself convex and superlinear at infinity. For these facts of convex analysis, as well as those used in the sequel, we refer to, e.g., [6].
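As a concrete illustration of the setting (our example, not required in the sequel), take \(k(x) = \cosh |x| - 1\), which satisfies all of the above assumptions. Then

$$\begin{aligned} \gamma (x) = \sinh (|x|)\,\frac{x}{|x|} \; (\gamma (0)=0), \qquad k^*(y) = |y| \sinh ^{-1}|y| - \sqrt{1+|y|^2} + 1, \end{aligned}$$

so that k and \(k^*\) are both superlinear at infinity while \(\gamma \) grows exponentially: this example is covered by the present hypotheses but by no polynomial growth framework.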

All assumptions on B and \(\gamma \) (hence also on k) are assumed to be in force from now on.

Definition 2.1

Let \(u_0\) be an \(L^2\)-valued \(\mathscr {F}_0\)-measurable random variable. A strong solution to Eq. (1.1) is a process \(u:\Omega \times [0,T] \rightarrow L^2(D)\) satisfying the following properties:

  1. (i)

    u is measurable, adapted and

    $$\begin{aligned} u \in L^1(0,T;W^{1,1}_0(D)) \quad \mathbb {P}\text {-a.s.}; \end{aligned}$$
  2. (ii)

    \(B(\cdot ,u)h\) is measurable and adapted for all \(h \in H\) and

    $$\begin{aligned} B(\cdot ,u) \in L^2(0,T;\mathscr {L}^2(H,L^2(D))) \quad \mathbb {P}\text {-a.s.}; \end{aligned}$$
  3. (iii)

    \(\gamma (\nabla u)\) is an \(L^1(D)^n\)-valued measurable adapted process with

    $$\begin{aligned} \gamma (\nabla u) \in L^1(0,T;L^1(D)^n) \quad \mathbb {P}\text {-a.s.}; \end{aligned}$$
  4. (iv)

    one has, as an equality in \(L^2(D)\),

    $$\begin{aligned} u(t) - \int _0^t{{\text {div}}\gamma (\nabla u(s))\,ds} = u_0 + \int _0^t B(s,u(s))\,dW(s) \quad \mathbb {P}\text {-a.s.} \end{aligned}$$
    (2.1)

    for all \(t \in [0,T]\).

Since \(\gamma (\nabla u)\) is only assumed to take values in \(L^1(D)^n\), the second term on the left-hand side of (2.1) does not belong, a priori, to \(L^2(D)\). The identity (2.1) has to be interpreted to hold in the sense of distributions, so that the term containing \(\gamma (\nabla u)\) takes values in \(L^2(D)\) by difference. In fact, the conditions on B in (ii) imply that the stochastic integral in (2.1) is an \(L^2(D)\)-valued local martingale.

Let \(\mathscr {K}\) be the set of measurable adapted processes \(\phi : \Omega \times [0,T] \rightarrow L^2(D)\) such that

Our main result is the following.

Theorem 2.2

Let \(u_0 \in L^2(\Omega ;L^2(D))\) be \(\mathscr {F}_0\)-measurable. Then (1.1) admits a strong solution u, which is unique within \(\mathscr {K}\). Moreover, u has weakly continuous paths in \(L^2(D)\) and the solution map \(u_0 \mapsto u\) is Lipschitz-continuous from \(L^2(\Omega ;L^2(D))\) to \(L^2(\Omega ;L^\infty (0,T;L^2(D)))\).

We do not know whether well-posedness continues to hold also without the condition that the solution belongs to \(\mathscr {K}\). This assumption, in fact, plays a crucial role in the proof of uniqueness.

Abbreviated notation for function spaces will be used from now on: Lebesgue and Sobolev spaces on D will be denoted without explicit mention of D itself; for any \(p \in [1,\infty ]\), \(L^p(\Omega )\) will be denoted by \(L^p_\omega \), \(L^p(0,T)\) by \(L^p_t\), and \(L^p(D)\) sometimes by \(L^p_x\). Mixed-norm spaces will be denoted just by juxtaposition, e.g. \(L^p_\omega L^q_t L^r_x\) to mean \(L^p(\Omega ;L^q(0,T;L^r(D)))\) and \(L^1_{t,x}\) to mean \(L^1([0,T] \times D)\).

3 An Itô formula for the square of the norm

We prove an Itô formula for the square of the \(L^2\)-norm of a class of processes with minimal integrability conditions. This is an essential tool to prove uniqueness of strong solutions and their continuous dependence on the initial datum in Sects. 5 and 6 below, and it is interesting in its own right.

Proposition 3.1

Assume that

$$\begin{aligned} y(t) + \alpha \int _0^t y(s)\,ds - \int _0^t {\text {div}}\zeta (s) \,ds = y_0 + \int _0^t C(s)\,dW(s) \end{aligned}$$

holds in \(L^2\) for all \(t \in [0,T]\) \(\mathbb {P}\)-a.s., where \(\alpha \ge 0\) is a constant,

$$\begin{aligned} y:\Omega \times [0,T] \rightarrow L^2, \quad \zeta :\Omega \times [0,T] \rightarrow L^1, \quad C: \Omega \times [0,T] \rightarrow \mathscr {L}^2(H,L^2) \end{aligned}$$

are measurable adapted processes such that

and \(y_0\) is an \(\mathscr {F}_0\)-measurable \(L^2\)-valued random variable with \(\mathop {{}\mathbb {E}}||y_0 ||^2<\infty \). If there exists a constant \(c>0\) such that

$$\begin{aligned} \mathop {{}\mathbb {E}}\int _0^T\!\!\int _D \left( k(c\nabla y) + k^*(c\zeta )\right) < \infty , \end{aligned}$$

then

$$\begin{aligned} ||y(t) ||^2 + 2\alpha \int _0^t ||y(s) ||^2\,ds + 2\int _0^t\!\!\int _D \nabla y(s) \cdot \zeta (s)\,dx\,ds = ||y_0 ||^2 + \int _0^t ||C(s) ||^2_{\mathscr {L}^2(H,L^2)}\,ds + 2\int _0^t y(s)C(s)\,dW(s) \end{aligned}$$

for all \(t\in [0,T]\) \(\mathbb {P}\)-almost surely.

Proof

Note that \({\text {div}}\zeta \in (W^{1,\infty }_0)'\), hence, by Sobolev embedding theorems and duality, there exists a positive integer r such that \({\text {div}}\zeta \in H^{-r}\). Therefore, denoting the Dirichlet Laplacian on \(L^2(D)\) by \(\Delta \), there also exists a positive integer m such that \((I-\delta \Delta )^{-m}\), \(\delta >0\), maps \(H^{-r}\) and (a fortiori) \(L^2\) to \(H^1_0 \cap W^{1,\infty }\) (indeed, each application of \((I-\delta \Delta )^{-1}\) gains two orders of Sobolev regularity, so it suffices to take \(2m-r>1+n/2\) and apply the Sobolev embedding theorem). Using the notation \(h^\delta :=(I-\delta \Delta )^{-m}h\), it is readily seen that

$$\begin{aligned} y^\delta (t)+\alpha \int _0^ty^\delta (s)\,ds-\int _0^t{\text {div}}\zeta ^\delta (s)\,ds= y^\delta _0+\int _0^tC^\delta (s)\,dW(s) \end{aligned}$$

for all \(t\in [0,T]\) \(\mathbb {P}\)-a.s. as an identity in \(L^2\), for which Itô’s formula yields

$$\begin{aligned} ||y^\delta (t) ||^2 + 2\alpha \int _0^t ||y^\delta (s) ||^2\,ds + 2\int _0^t\!\!\int _D \nabla y^\delta (s) \cdot \zeta ^\delta (s)\,dx\,ds = ||y^\delta _0 ||^2 + \int _0^t ||C^\delta (s) ||^2_{\mathscr {L}^2(H,L^2)}\,ds + 2\int _0^t y^\delta (s)C^\delta (s)\,dW(s) \end{aligned}$$

for all \(t \in [0,T]\) \(\mathbb {P}\)-almost surely. We are going to pass to the limit as \(\delta \rightarrow 0\) in this identity. The dominated convergence theorem immediately implies that, \(\mathbb {P}\)-a.s.,

for all \(t \in [0,T]\), and \(||y^\delta _0 ||^2 \rightarrow ||y_0 ||^2\), as \(\delta \rightarrow 0\). Defining the real local martingales

$$\begin{aligned} M^{\delta } := (y^{\delta } C^{\delta }) \cdot W, \quad M := (yC) \cdot W, \end{aligned}$$

we are going to show that

as \(\delta \rightarrow 0\). In fact, Davis’ inequality for local martingales (see, e.g., [10]) yields

and one has, identifying \(\mathscr {L}^2(H,\mathbb {R})\) with H and recalling that \((I-\delta \Delta )^{-m}\) is contractive in \(L^2\),

so that

It follows by the Cauchy–Schwarz inequality that the first term on the right-hand side is dominated by

which converges to zero by properties of Hilbert–Schmidt operators and the dominated convergence theorem. Moreover,

and \(y \in L^\infty _t L^2_x\), \(C \in L^2_t\mathscr {L}(H,L^2_x)\) \(\mathbb {P}\)-a.s. imply, by dominated convergence, that

\(\mathbb {P}\)-a.s. as \(\delta \rightarrow 0\). Since

and, by the Cauchy–Schwarz inequality,

again by dominated convergence it follows that

as \(\delta \rightarrow 0\). We have thus shown that as \(\delta \rightarrow 0\), hence, in particular, that

$$\begin{aligned} \int _0^t y^\delta (s)C^\delta (s)\,dW(s) \longrightarrow \int _0^t y(s)C(s)\,dW(s) \end{aligned}$$

in probability as \(\delta \rightarrow 0\) for all \(t \in [0,T]\).

To complete the proof, we are going to show that \(\nabla y^{\delta }\cdot \zeta ^{\delta } \rightarrow \nabla y\cdot \zeta \) in \(L^1(\Omega \times (0,T) \times D)\), which readily implies that

$$\begin{aligned} \int _0^t\!\!\int _D \nabla y^{\delta }(s,x) \cdot \zeta ^{\delta }(s,x) \,dx\,ds \longrightarrow \int _0^t\!\!\int _D \nabla y(s,x) \cdot \zeta (s,x) \,dx\,ds \end{aligned}$$

in probability for all \(t \in [0,T]\). Since \(\nabla y^{\delta } \rightarrow \nabla y\) and \(\zeta ^{\delta } \rightarrow \zeta \) in measure in \(\Omega \times (0,T) \times D\), in view of Vitali’s theorem, it suffices to prove that the sequence \((\nabla y^{\delta }\cdot \zeta ^{\delta })\) is uniformly integrable in \(\Omega \times (0,T) \times D\). One has

$$\begin{aligned} c^2 \left( \nabla y^\delta \cdot \zeta ^\delta \right)&\le k\left( c\nabla y^\delta \right) + k^*\left( c\zeta ^\delta \right) ,\\ -\,c^2 \left( \nabla y^\delta \cdot \zeta ^\delta \right)&\le k\left( c(-\nabla y^\delta ) \right) + k^*\left( c\zeta ^\delta \right) \end{aligned}$$

hence

where the second inequality follows by the hypothesis \(\limsup _{|x|\rightarrow \infty } k(-x)/k(x)<\infty \). By Jensen’s inequality for sub-Markovian operators (see [5, Theorem 3.4]) we also have

$$\begin{aligned} k\left( c\nabla y^\delta \right)&= k\left( (I-\delta \Delta )^{-m} c\nabla y \right) \le (I-\delta \Delta )^{-m} k\left( c\nabla y \right) ,\\ k^*\left( c\zeta ^\delta \right)&=k^*\left( (I-\delta \Delta )^{-m} c\zeta \right) \le (I-\delta \Delta )^{-m} k^*\left( c\zeta \right) , \end{aligned}$$

hence

where the right-hand side is uniformly integrable because it converges in \(L^1(\Omega \times (0,T) \times D)\) as \(\delta \rightarrow 0\). This yields that \((\nabla y^\delta \cdot \zeta ^\delta )\) is uniformly integrable as well, thus concluding the proof. \(\square \)

4 Well-posedness for an auxiliary SPDE

Let \(V_0\) be a separable Hilbert space, densely and continuously embedded in \(H^1_0\), and continuously embedded in \(W^{1,\infty }\). The Sobolev embedding theorem easily implies that such a space indeed exists.

We are going to prove that the auxiliary equation

$$\begin{aligned} du(t) - {\text {div}}\gamma (\nabla u(t))\,dt = G(t)\,dW(t), \quad u(0)=u_0, \end{aligned}$$
(4.1)

where G is an \(\mathscr {L}^2(H,V_0)\)-valued process, is well posed.

Proposition 4.1

Assume that \(u_0 \in L^2(\Omega ;L^2)\) is \(\mathscr {F}_0\)-measurable and that \(G:\Omega \times [0,T] \rightarrow \mathscr {L}^2(H,V_0)\) is measurable and adapted, with

Then Eq. (4.1) admits a unique strong solution u such that

Moreover, the paths of u are \(\mathbb {P}\)-a.s. weakly continuous with values in \(L^2\).

The assumptions of Proposition 4.1 are (tacitly) assumed to hold throughout the section.

Let \(\gamma _\lambda : \mathbb {R}^n \rightarrow \mathbb {R}^n\), \(\lambda >0\), be the Yosida regularization of \(\gamma \), i.e.

$$\begin{aligned} \gamma _\lambda := \frac{1}{\lambda }\left( I - (I+\lambda \gamma )^{-1} \right) , \quad \lambda >0, \end{aligned}$$

and consider the regularized equation

$$\begin{aligned} du_\lambda (t) - {\text {div}}\gamma _\lambda (\nabla u_\lambda (t))\,dt - \lambda \Delta u_\lambda (t)\,dt = G(t)\,dW(t), \quad u_\lambda (0)=u_0. \end{aligned}$$

Since \(\gamma _\lambda \) is monotone and Lipschitz-continuous, it is not difficult to check that the operator

$$\begin{aligned} v \longmapsto - \left( {\text {div}}\gamma _\lambda (\nabla v) + \lambda \Delta v \right) \end{aligned}$$

satisfies the conditions of the classical variational approach by Pardoux, Krylov and Rozovskiĭ [7, 12] on the Gelfand triple \(H^1_0 \hookrightarrow L^2 \hookrightarrow H^{-1}\), hence there exists a unique adapted process \(u_\lambda \) with values in \(H^1_0\) such that

$$\begin{aligned} u_\lambda \in L^2(\Omega ;C([0,T];L^2)) \cap L^2(\Omega ;L^2(0,T;H^1_0)) \end{aligned}$$

and

$$\begin{aligned} u_\lambda (t) - \int _0^t {\text {div}}\gamma _\lambda (\nabla u_\lambda (s))\,ds - \lambda \int _0^t \Delta u_\lambda (s)\,ds = u_0 + \int _0^t G(s)\,dW(s) \end{aligned}$$
(4.2)

in \(H^{-1}\) for all \(t \in [0,T]\).
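The basic properties of the Yosida regularization used above (monotonicity, Lipschitz continuity with constant \(1/\lambda \), pointwise convergence to \(\gamma \), and the resolvent decomposition \(x = (I+\lambda \gamma )^{-1}x + \lambda \gamma _\lambda (x)\)) can be illustrated numerically in a toy scalar case. The following sketch is ours and purely for intuition; the nonlinearity \(\gamma (r)=r^3\) and the bisection solver are arbitrary illustrative choices, not part of the argument.

```python
import numpy as np

# Illustrative sketch (not from the paper): for a continuous monotone scalar
# map gamma, the resolvent r = (I + lam*gamma)^{-1}(x) is the unique solution
# of r + lam*gamma(r) = x, and the Yosida regularization is
# gamma_lam(x) = (x - r) / lam.

def resolvent(x, lam, gamma, lo=-1e6, hi=1e6):
    """Solve r + lam*gamma(r) = x by bisection (the map is strictly increasing)."""
    f = lambda r: r + lam * gamma(r) - x
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def yosida(x, lam, gamma):
    return (x - resolvent(x, lam, gamma)) / lam

gamma = lambda r: r ** 3   # gradient of the superlinear potential k(r) = r^4/4

xs = np.linspace(-2.0, 2.0, 9)
for lam in (1.0, 0.1, 0.01):
    vals = np.array([yosida(x, lam, gamma) for x in xs])
    assert np.all(np.diff(vals) >= 0)             # gamma_lam is monotone
    slopes = np.abs(np.diff(vals) / np.diff(xs))
    assert np.all(slopes <= 1.0 / lam + 1e-8)     # and 1/lam-Lipschitz

# splitting x = (I + lam*gamma)^{-1} x + lam*gamma_lam(x), as used in Lemma 4.4
x, lam = 1.5, 0.1
assert abs(resolvent(x, lam, gamma) + lam * yosida(x, lam, gamma) - x) < 1e-9

# pointwise convergence gamma_lam -> gamma as lam -> 0
assert abs(yosida(1.5, 1e-4, gamma) - gamma(1.5)) < 0.01
```

The Lipschitz bound \(1/\lambda \) is what makes the regularized drift fit the variational framework, while the splitting checked above is exactly the identity invoked below to transfer weak compactness from \((\gamma _\lambda (\nabla u_\lambda ))\) to \((\nabla u_\lambda )\).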

4.1 A priori estimates

We are now going to establish several a priori estimates for \(u_\lambda \) and related processes, both pathwise and in expectation.

We begin with a simple maximal estimate for stochastic integrals that will be used several times in the sequel.

Lemma 4.2

Let U, H, K be separable Hilbert spaces. If

$$\begin{aligned} F: \Omega \times [0,T] \rightarrow \mathscr {L}(H,K), \quad G: \Omega \times [0,T] \rightarrow \mathscr {L}^2(U,H) \end{aligned}$$

are measurable and adapted processes such that

then, for any \(\varepsilon >0\),

Proof

By the ideal property of Hilbert–Schmidt operators (see, e.g., [4, p. V.52]), one has

for all \(s \in [0,T]\), hence

where the right-hand side is finite \(\mathbb {P}\)-a.s. thanks to the assumptions on F and G. Then \((FG) \cdot W\) is a K-valued local martingale, for which Davis’ inequality yields

The proof is finished invoking the elementary inequality

$$\begin{aligned} ab \le \frac{1}{2} \left( \varepsilon a^2 + \frac{1}{\varepsilon } b^2\right) \quad \forall a, b \in \mathbb {R}, \; \varepsilon >0, \end{aligned}$$

and choosing \(\varepsilon \) properly. \(\square \)

The estimate in the previous lemma will be used only in the case \(K=\mathbb {R}\). The proof in this greater generality is, however, no more complicated than in the special case actually needed.

Lemma 4.3

There exists a constant N such that

Proof

Itô’s formula yields

where \(u_\lambda \) in the stochastic integral on the right-hand side has to be interpreted as taking values in \(\mathscr {L}(L^2,\mathbb {R}) \simeq L^2\). Taking supremum in time and expectation we get

where, by Lemma 4.2,

for any \(\varepsilon >0\). The proof is completed choosing \(\varepsilon \) small enough and recalling that \(\gamma _\lambda \) is monotone. \(\square \)

Lemma 4.4

The families \((\nabla u_\lambda )\) and \((\gamma _\lambda (\nabla u_\lambda ))\) are relatively weakly compact in \(L^1(\Omega \times (0,T) \times D)\).

Proof

Recall that, for any y, \(r \in \mathbb {R}^n\), one has \(k(y)+k^*(r)=r\cdot y\) if and only if \(r \in \partial k(y)=\gamma (y)\). Therefore, since

$$\begin{aligned} \gamma _\lambda (x) \in \partial k\left( (I+\lambda \gamma )^{-1}x\right) = \gamma \left( (I+\lambda \gamma )^{-1}x\right) \quad \forall x \in \mathbb {R}^n, \end{aligned}$$

we deduce, by the definition of \(\gamma _\lambda \), that

(4.3)

By Lemma 4.3 we infer that there exists a constant N, independent of \(\lambda \), such that

$$\begin{aligned} \mathop {{}\mathbb {E}}\int _0^T\!\!\int _D k^*\left( \gamma _\lambda (\nabla u_\lambda )\right) \le \mathop {{}\mathbb {E}}\int _0^T\!\!\int _D \gamma _\lambda (\nabla u_\lambda )\cdot \nabla u_\lambda < N. \end{aligned}$$

Since \(k^*\) is superlinear at infinity, the family \((\gamma _\lambda (\nabla u_\lambda ))\) is uniformly integrable on \(\Omega \times (0,T) \times D\) by the de la Vallée Poussin criterion (see the “Appendix”), hence relatively weakly compact in \(L^1(\Omega \times (0,T) \times D)\) by a well-known theorem of Dunford and Pettis.

Similarly, Lemma 4.3 and (4.3) imply that there exists a constant N, independent of \(\lambda \), such that

$$\begin{aligned} \mathop {{}\mathbb {E}}\int _0^T\!\!\int _D k\left( (I+\lambda \gamma )^{-1}\nabla u_\lambda \right) \le \mathop {{}\mathbb {E}}\int _0^T\!\!\int _D \gamma _\lambda (\nabla u_\lambda ) \cdot \nabla u_\lambda < N. \end{aligned}$$

Since k is superlinear at infinity, the criteria by de la Vallée Poussin and Dunford-Pettis imply that the family \(\left( (I+\lambda \gamma )^{-1}\nabla u_\lambda \right) \) is uniformly integrable on \(\Omega \times (0,T) \times D\), hence relatively weakly compact in \(L^1(\Omega \times (0,T) \times D)\). Moreover, since

$$\begin{aligned} \nabla u_\lambda =(I+\lambda \gamma )^{-1}\nabla u_\lambda +\lambda \gamma _\lambda (\nabla u_\lambda ), \end{aligned}$$

the relative weak compactness of \((\nabla u_\lambda )\) immediately follows by the same property of \((\gamma _\lambda (\nabla u_\lambda ))\) proved above. \(\square \)

We shall need below the following classical absolute continuity result, whose proof can be found, for instance, in [2, p. 25].

Lemma 4.5

Let V and H be Hilbert spaces with \(V \hookrightarrow H \hookrightarrow V'\). Assume that \(u \in L^2(a,b;V)\) and \(u' \in L^2(a,b;V')\), where \(u'\) is the derivative of u in the sense of \(V'\)-valued distributions. Then there exists \(\tilde{u} \in C([a,b];H)\) such that \(u(t)=\tilde{u}(t)\) for almost all \(t \in [a,b]\). Moreover, for any v satisfying the same hypotheses as u, \(\langle u,v \rangle \) is absolutely continuous on [a, b] and

$$\begin{aligned} \frac{d}{dt} \langle u(t),v(t) \rangle = \langle u'(t),v(t) \rangle + \langle u(t),v'(t) \rangle \quad \text {for a.e. } t \in [a,b]. \end{aligned}$$

As customary, both the duality pairing between V and \(V'\) as well as the scalar product of H have been denoted by the same symbol.

From now on we shall assume, without loss of generality, that \(\lambda \in \mathopen ]0,1\mathclose ]\).

Lemma 4.6

There exists \(\Omega ' \subseteq \Omega \) with \(\mathbb {P}(\Omega ')=1\) and \(M:\Omega ' \rightarrow \mathbb {R}\) such that

for all \(\omega \in \Omega '\).

Proof

Setting \(v_\lambda := u_\lambda - G \cdot W\), Eq. (4.2) can be written as

$$\begin{aligned} v_\lambda (t) - \int _0^t {\text {div}}\left( \gamma _\lambda (\nabla u_\lambda (s)) + \lambda \nabla u_\lambda (s)\right) \,ds = u_0, \end{aligned}$$

or, equivalently, as

$$\begin{aligned} v'_\lambda - {\text {div}}\left( \gamma _\lambda (\nabla u_\lambda ) + \lambda \nabla u_\lambda \right) = 0, \quad v_\lambda (0) = u_0. \end{aligned}$$
(4.4)

By Itô’s isometry and Doob’s inequality, one has

hence \(G \cdot W(\omega ,\cdot ) \in L^\infty _t H^1_0\) for \(\mathbb {P}\)-a.a. \(\omega \in \Omega \), because \(V_0 \hookrightarrow H^1_0\). In particular, since \(u_\lambda (\omega ,\cdot ) \in L^2_t H^1_0\) for \(\mathbb {P}\)-a.a. \(\omega \in \Omega \), it follows that \(v_\lambda (\omega ,\cdot ) \in L^2_t H^1_0\) for \(\mathbb {P}\)-a.a. \(\omega \in \Omega \). Moreover, since \({\text {div}}\gamma _\lambda (\nabla u_\lambda )\) and \(\Delta u_\lambda \) belong to \(L^2_t H^{-1}\) \(\mathbb {P}\)-a.s., by the previous identity we also deduce that \(v'_\lambda (\omega ) \in L^2_t H^{-1}\) for \(\mathbb {P}\)-a.a. \(\omega \in \Omega \). In particular, taking into account the hypotheses on \(u_0\) and G, there exists \(\Omega ' \subset \Omega \), with \(\mathbb {P}(\Omega ')=1\), such that

$$\begin{aligned}&u_0(\omega ) \in L^2_x, \quad G \cdot W(\omega ,\cdot ) \in L^\infty _t V_0,\\&v_\lambda (\omega ) \in L^2_t H^1_0, \quad v'_\lambda (\omega ) \in L^2_t H^{-1} \end{aligned}$$

for all \(\omega \in \Omega '\). Let us consider from now on a fixed but arbitrary \(\omega \in \Omega '\). Taking the duality pairing of (4.4) by \(v_\lambda \) and integrating (more precisely, applying Lemma 4.5) implies that, for all \(t \in [0,T]\),

$$\begin{aligned}&\frac{1}{2}||v_\lambda (t) ||^2 + \int _0^t\!\!\int _D \gamma _\lambda (\nabla u_\lambda (s)) \cdot \nabla v_\lambda (s)\,dx\,ds\\&\quad + \lambda \int _0^t\!\!\int _D \nabla u_\lambda (s) \cdot \nabla v_\lambda (s)\,dx\,ds = \frac{1}{2}||u_0 ||^2, \end{aligned}$$

where \(||u_\lambda || \le ||v_\lambda || + ||G\cdot W ||\), hence \(||u_\lambda ||^2 \le 2\left( ||v_\lambda ||^2 + ||G\cdot W ||^2\right) \), as well as

$$\begin{aligned} ||v_\lambda ||^2 \ge \frac{1}{2} ||u_\lambda ||^2 - ||G \cdot W ||^2. \end{aligned}$$

Moreover, Young’s inequality yields

hence also, taking into account the previous estimate,

(4.5)

Let \(k_\lambda \) be the Moreau–Yosida regularization of k, i.e.

$$\begin{aligned} k_\lambda (x) := \inf _{y \in \mathbb {R}^n} \left( k(y) + \frac{|x-y|^2}{2\lambda } \right) , \quad \lambda >0. \end{aligned}$$
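For instance (a worked example of ours, not needed in the sequel): if \(k(x)=|x|^2/2\), so that \(\gamma (x)=x\), the infimum is attained at \(y=(I+\lambda \gamma )^{-1}x = x/(1+\lambda )\), and one computes

$$\begin{aligned} k_\lambda (x) = \frac{|x|^2}{2(1+\lambda )}, \qquad \partial k_\lambda (x) = \frac{x}{1+\lambda } = \gamma _\lambda (x), \end{aligned}$$

with \(k_\lambda \nearrow k\) pointwise as \(\lambda \downarrow 0\).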

As is well known, \(k_\lambda \) is a proper convex function that converges pointwise to k from below, and \(\partial k_\lambda = \gamma _\lambda \). Therefore, it follows from

$$\begin{aligned} \gamma _\lambda (x)\cdot (x-y) \ge k_\lambda (x)-k_\lambda (y) \ge k_\lambda (x)-k(y) \quad \forall x,y \in \mathbb {R}^n \end{aligned}$$

that

$$\begin{aligned}&\int _0^t\!\!\int _D \gamma _\lambda (\nabla u_\lambda (s)) \cdot \nabla v_\lambda (s)\,dx\,ds\\&\quad = \int _0^t\!\!\int _D \gamma _\lambda (\nabla u_\lambda (s,x))(\nabla u_\lambda (s,x) - \nabla (G \cdot W(s,x)))\,dx\,ds\\&\quad \ge \int _0^t\!\!\int _D k_\lambda (\nabla u_\lambda (s,x))\,dx\,ds - \int _0^t\!\!\int _D k(\nabla (G\cdot W(s,x)))\,dx\,ds, \end{aligned}$$

hence also

Taking the supremum with respect to t yields

As already observed above, the first three terms on the right-hand side are clearly finite. Moreover, since \(V_0 \hookrightarrow W^{1,\infty }\), one has

by the continuity of k. Since \(\omega \) was chosen arbitrarily in \(\Omega '\), the proof is completed. \(\square \)

Lemma 4.7

There exists a set \(\Omega '\), with \(\mathbb {P}(\Omega ')=1\), such that, for all \(\omega \in \Omega '\), the families \((\gamma _\lambda (\nabla u_\lambda ))\) and \((\nabla u_\lambda )\) are relatively weakly compact in \(L^1_{t,x}\).

Proof

Let \(\Omega '\) be defined as in the proof of Lemma 4.6, and fix an arbitrary \(\omega \in \Omega '\). By (4.5), since \(v_\lambda =u_\lambda - G \cdot W\), it follows that

$$\begin{aligned}&\int _0^t\!\!\int _D\gamma _\lambda (\nabla u_\lambda (s)) \cdot \nabla u_\lambda (s)\,dx\,ds\\&\quad \le \frac{1}{2} ||u_0 ||^2 + \frac{1}{2} ||G\cdot W(t) ||^2 + \frac{1}{2} \int _0^t ||G\cdot W(s) ||_{H^1_0}^2\,ds\\&\qquad + \int _0^t\!\!\int _D \gamma _\lambda (\nabla u_\lambda (s))\cdot \nabla (G\cdot W(s))\,dx\,ds \end{aligned}$$

for all \(t \le T\). Thanks to Young’s inequality, convexity of \(k^*\), and \(k^*(0)=0\), one has

$$\begin{aligned} \gamma _\lambda (\nabla u_\lambda )\cdot \nabla (G\cdot W)&= \frac{1}{2} \gamma _\lambda (\nabla u_\lambda ) \cdot 2\nabla (G\cdot W)\\&\le \frac{1}{2} k^*\left( \gamma _\lambda (\nabla u_\lambda )\right) + k(2\nabla (G \cdot W)). \end{aligned}$$

Recalling that \(k^*(\gamma _\lambda (x)) \le \gamma _\lambda (x) \cdot x\) for all \(x \in \mathbb {R}^n\), rearranging terms one gets

$$\begin{aligned} \int _0^T\!\!\int _D k^*(\gamma _\lambda (\nabla u_\lambda (s))) \,dx\,ds&\lesssim ||u_0 ||^2 + ||G\cdot W(T) ||^2 + \int _0^T ||G\cdot W(s) ||_{H^1_0}^2\,ds\\&\quad + \int _0^T\!\!\int _D k\left( 2\nabla (G\cdot W(s))\right) \,dx\,ds, \end{aligned}$$

where all terms on the right-hand side are finite, as already established in the proof of Lemma 4.6. Appealing again to the criteria by de la Vallée Poussin and Dunford-Pettis, we immediately infer that \((\gamma _\lambda (\nabla u_\lambda (\omega ,\cdot )))\) is relatively weakly compact in \(L^1_{t,x}\).

Denoting by M (a constant depending on \(\omega \)) the right-hand side of the previous inequality, the above estimates also yield

hence also, recalling that \(k((I+\lambda \gamma )^{-1}x) \le \gamma _\lambda (x) \cdot x\),

This implies, in complete analogy to the previous case, that \(\left( (I+\lambda \gamma )^{-1}\nabla u_\lambda \right) \) is relatively weakly compact in \(L^1_{t,x}\). Since

$$\begin{aligned} \nabla u_\lambda = \lambda \gamma _\lambda (\nabla u_\lambda ) + (I+\lambda \gamma )^{-1}\nabla u_\lambda , \end{aligned}$$

the relative weak compactness of \((\nabla u_\lambda (\omega ,\cdot ))\) in \(L^1_{t,x}\) follows immediately. \(\square \)

4.2 Proof of Proposition 4.1

Let \(\omega \in \Omega '\) be arbitrary but fixed, where \(\Omega '\) is a subset of \(\Omega \) with probability one, chosen as in the proof of Lemma 4.6. The relative weak compactness of \((\gamma _\lambda (\nabla u_\lambda ))\) in \(L^1_{t,x}\), proved in Lemma 4.7, implies that there exists \(\eta \in L^1_{t,x}\) such that \(\gamma _{\mu }(\nabla u_{\mu }) \rightarrow \eta \) weakly in \(L^1_{t,x}\), where \(\mu \) is a subsequence of \(\lambda \). This in turn implies that

$$\begin{aligned} \int _0^t {\text {div}}\gamma _{\mu }(\nabla u_{\mu }(s))\,ds \longrightarrow \int _0^t {\text {div}}\eta (s)\,ds \quad \text {weakly in } V_0' \end{aligned}$$

for all \(t \in [0,T]\). In fact, for any \(\phi _0 \in V_0\), setting \(\phi :=s \mapsto 1_{[0,t]}(s) \phi _0 \in L^\infty _t V_0\), recalling that \(V_0\hookrightarrow W^{1,\infty }\), we have

as \(\mu \rightarrow 0\). Moreover, \(\sqrt{\lambda } u_\lambda \) is bounded in \(L^2_t H^1_0\) thanks to Lemma 4.6, hence, recalling that \(\Delta \) is an isomorphism of \(H^1_0\) and \(H^{-1}\), \(\lambda \Delta u_\lambda \rightarrow 0\) in \(L^2_t H^{-1}\) as \(\lambda \rightarrow 0\), in particular

$$\begin{aligned} \lambda \int _0^t \Delta u_\lambda (s)\,ds \longrightarrow 0 \quad \text {in } H^{-1} \end{aligned}$$

for all \(t \in [0,T]\) as \(\lambda \rightarrow 0\). Therefore, considering the regularized equation

$$\begin{aligned} u_\mu (t) - \int _0^t {\text {div}}\gamma _\mu (\nabla u_\mu (s))\,ds - \mu \int _0^t \Delta u_\mu (s)\,ds = u_0 + G \cdot W(t) \end{aligned}$$

and passing to the limit as \(\mu \rightarrow 0\), we infer that \(u_\mu (t) \rightarrow u(t)\) weakly in \(V_0'\) for all \(t \in [0,T]\), hence one can write

$$\begin{aligned} u(t) - \int _0^t {\text {div}}\eta (s)\,ds = u_0 + G \cdot W(t) \quad \text {in }V_0' \end{aligned}$$
(4.6)

for all \(t \in [0,T]\). Since \({\text {div}}\eta \in L^1_t V_0'\) and \(G \cdot W \in L^\infty _t V_0\), it immediately follows that \(u \in C_t V_0'\). Moreover, since, thanks to Lemma 4.6, \((u_\mu (t))\) is bounded in \(L^2\), we also have \(u_\mu (t) \rightarrow u(t)\) weakly in \(L^2\). In fact, let \(\varepsilon >0\) and \(\psi \in L^2\) be arbitrary. Since \(V_0\) is dense in \(L^2\), there exists \(\phi \in V_0\) with \(||\psi - \phi || \le \varepsilon \), and one can write

where the second term on the right-hand side converges to zero as \(\mu ,\,\nu \rightarrow 0\), and

so that, recalling that Hilbert spaces are weakly sequentially complete, \(u_\mu (t)\) converges weakly in \(L^2\), necessarily to u(t), for all \(t \in [0,T]\). This also immediately implies that \(u \in L^\infty _t L^2_x\). From this, together with \(u \in C_t V_0'\), it follows in turn that \(u \in C_w([0,T];L^2)\) by a criterion due to Strauss (see [14, Theorem 2.1]—here and below \(C_w([0,T];E)\) stands for the space of weakly continuous functions from [0, T] to a Banach space E). Furthermore, since all terms in (4.6) except the second one on the left-hand side take values in \(L^2\), it follows that (4.6) is satisfied also as an identity in \(L^2\).

Let us show that \(u \in L^1_t W^{1,1}_0\): the relative weak compactness of \((\nabla u_\lambda )\) in \(L^1_{t,x}\), proved in Lemma 4.7, implies that there exists \(v \in L^1_{t,x}\) such that, along a subsequence of \(\lambda \) which can be assumed to coincide with \(\mu \), \(\nabla u_{\mu } \rightarrow v\) weakly in \(L^1_{t,x}\). Taking into account that \(u_{\mu } \in H^1_0\) for all \(\mu \) and that \(u_\mu \rightarrow u\) weakly* in \(L^\infty _tL^2_x\), it easily follows that \(v=\nabla u\) a.e. in \([0,T] \times D\) and that \(u \in L^1_t W^{1,1}_0\).

As a next step, we are going to show that \(\eta = \gamma (\nabla u)\) a.e. in \((0,T) \times D\). For this we shall need the “energy” identity proved in the following lemma.

Lemma 4.8

Assume that

$$\begin{aligned} y(t) - \int _0^t {\text {div}}\zeta (s)\,ds = y_0 + f(t) \quad \text {in } L^2 \quad \forall t\in [0,T], \end{aligned}$$

where \(y_0 \in L^2_x\), \(y \in L^\infty _tL^2_x \cap L^1_tW^{1,1}_0\), \(\zeta \in L^1_{t,x}\), and \(f \in L^2_tV_0\) with \(f(0)=0\). Furthermore, assume that there exists \(c>0\) such that

$$\begin{aligned} k(c\nabla y) + k^*(c\zeta ) \in L^1_{t,x}. \end{aligned}$$

Then

$$\begin{aligned} \frac{1}{2}||y(t) - f(t) ||^2 + \int _0^t\!\!\int _D \zeta (s) \cdot \nabla \left( y(s) - f(s)\right) \,dx\,ds = \frac{1}{2}||y_0 ||^2 \quad \forall t \in [0,T]. \end{aligned}$$
Proof

The proof is analogous to that of Proposition 3.1, whose notation and setup we borrow. In particular, with \(m \in \mathbb {N}\) chosen as there, one has

$$\begin{aligned} y^\delta (t) - \int _0^t {\text {div}}\zeta ^\delta (s)\,ds = y^\delta _0 + f^\delta (t) \quad \text {in } L^2 \quad \forall t\in [0,T], \end{aligned}$$

hence, by Lemma 4.5,

$$\begin{aligned} \frac{1}{2}||y^\delta (t) - f^\delta (t) ||^2 + \int _0^t\!\!\int _D \zeta ^\delta (s) \cdot \nabla \left( y^\delta (s) - f^\delta (s)\right) \,dx\,ds = \frac{1}{2}||y^\delta _0 ||^2, \end{aligned}$$
where, as \(\delta \rightarrow 0\), for all \(t \in ]0,T]\) and . Moreover, since \(y^\delta - f^\delta \rightarrow y - f\) in \(L^1_tW^{1,1}_0\) and \(\zeta ^\delta \rightarrow \zeta \) in \(L^1_{t,x}\), we have that, up to selecting a subsequence,

$$\begin{aligned} \zeta ^\delta \cdot \nabla \left( y^\delta - f^\delta \right) \longrightarrow \zeta \cdot \nabla \left( y - f\right) \end{aligned}$$

almost everywhere in \([0,T] \times D\). Therefore, taking Vitali’s theorem into account, the lemma is proved if we show that \(\zeta ^\delta \cdot \nabla (y^\delta - f^\delta )\) is uniformly integrable: one has, by Young’s inequality and convexity,

$$\begin{aligned} \frac{c^2}{2} \zeta ^\delta \cdot \nabla (y^\delta - f^\delta )&\le k\left( c/2 (\nabla y^\delta - \nabla f^\delta ) \right) + k^*\left( c\zeta ^\delta \right) \\&\le \frac{1}{2} k\left( c\nabla y^\delta \right) + \frac{1}{2} k\left( c(-\nabla f^\delta ) \right) + k^*\left( c\zeta ^\delta \right) , \end{aligned}$$

as well as

$$\begin{aligned} - \frac{c^2}{2} \zeta ^\delta \cdot \nabla (y^\delta - f^\delta )&\le k\left( c/2 (-\nabla y^\delta + \nabla f^\delta ) \right) + k^*\left( c\zeta ^\delta \right) \\&\le \frac{1}{2} k\left( c(-\nabla y^\delta ) \right) + \frac{1}{2} k\left( c\nabla f^\delta \right) + k^*\left( c\zeta ^\delta \right) , \end{aligned}$$

hence

$$\begin{aligned} \frac{c^2}{2} \left| \zeta ^\delta \cdot \nabla (y^\delta - f^\delta ) \right| \le \frac{1}{2} \left( k(c\nabla y^\delta ) + k(c(-\nabla y^\delta )) + k(c\nabla f^\delta ) + k(c(-\nabla f^\delta )) \right) + 2k^*(c\zeta ^\delta ). \end{aligned}$$
It follows by Jensen’s inequality for sub-Markovian operators, recalling that \((I-\delta \Delta )^{-m}\) and \(\nabla \) commute, that

$$\begin{aligned} \frac{c^2}{2} \left| \zeta ^\delta \cdot \nabla (y^\delta - f^\delta ) \right| \le (I-\delta \Delta )^{-m} \left( \frac{1}{2} \left( k(c\nabla y) + k(c(-\nabla y)) + k(c\nabla f) + k(c(-\nabla f)) \right) + 2k^*(c\zeta ) \right) , \end{aligned}$$
where \(k(c\nabla y)\) and \(k^*(c\zeta )\) belong to \(L^1_{t,x}\) by assumption, and the same holds for \(k(c\nabla f) + k(c(-\nabla f))\) because \(f \in W^{1,\infty }\). Moreover, the hypothesis \(\limsup _{|x| \rightarrow \infty } k(-x)/k(x)<\infty \) implies that

$$\begin{aligned} \int _0^T\!\!\int _D k(c(-\nabla y)) \lesssim 1 + \int _0^T\!\!\int _D k(c\nabla y) < \infty , \end{aligned}$$

therefore, taking into account that \((I-\delta \Delta )^{-m}\) is a contraction in \(L^1\), we obtain that \(c^2|\zeta ^\delta \cdot \nabla (y^\delta - f^\delta )|\) is dominated by a sequence that converges in \(L^1_{t,x}\), which immediately implies that \(\zeta ^\delta \cdot \nabla (y^\delta - f^\delta )\) is uniformly integrable in \([0,T] \times D\). \(\square \)
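As a concrete illustration of the two elementary inequalities driving the uniform-integrability argument above (the Fenchel–Young inequality and the convexity of k), here is a small numerical sanity check with the hypothetical choice \(k(x)=|x|^4/4\), \(k^*(y)=\tfrac{3}{4}|y|^{4/3}\); the paper's k is of course a general convex function:

```python
import numpy as np

# Sanity check of the inequalities used in the proof of Lemma 4.8, for the
# hypothetical choice k(x) = |x|^4/4, whose convex conjugate is
# k*(y) = (3/4)|y|^(4/3); the paper's k is a general convex function.
rng = np.random.default_rng(0)

def k(x):
    return np.abs(x) ** 4 / 4

def k_star(y):
    return 0.75 * np.abs(y) ** (4 / 3)

x = rng.normal(size=10_000)
y = rng.normal(size=10_000)

# Fenchel-Young inequality x*y <= k(x) + k*(y), with equality at y = k'(x) = x^3.
assert np.all(x * y <= k(x) + k_star(y) + 1e-12)
assert np.allclose(x * x ** 3, k(x) + k_star(x ** 3))

# Convexity of k: k((a+b)/2) <= (k(a) + k(b))/2, used to split
# k((c/2)(grad y - grad f)) into the four terms above.
a = rng.normal(size=10_000)
b = rng.normal(size=10_000)
assert np.all(k((a + b) / 2) <= (k(a) + k(b)) / 2 + 1e-12)
```

Equality in the first inequality holds exactly at \(y = \gamma (x)\), which is what forces \(\eta = \gamma (\nabla u)\) in the limit argument below.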

As in the proof of Lemma 4.6, it follows from (4.4) and Lemma 4.5 that

$$\begin{aligned}&\frac{1}{2}||v_\lambda (t) ||^2 + \int _0^t\!\!\int _D \gamma _\lambda (\nabla u_\lambda (s)) \cdot \nabla v_\lambda (s)\,dx\,ds\\&\quad + \lambda \int _0^t\!\!\int _D \nabla u_\lambda (s) \cdot \nabla v_\lambda (s)\,dx\,ds = \frac{1}{2}||u_0 ||^2 \end{aligned}$$

for all \(t \in [0,T]\), where \(v_\lambda = u_\lambda - G \cdot W\). This immediately implies

$$\begin{aligned} \begin{aligned}&\frac{1}{2}||v_\lambda (t) ||^2 + \int _0^t\!\!\int _D \gamma _\lambda (\nabla u_\lambda (s)) \cdot \nabla u_\lambda (s)\,dx\,ds\\&\quad \le \frac{1}{2}||u_0 ||^2 + \int _0^t\!\!\int _D \gamma _\lambda (\nabla u_\lambda (s)) \cdot \nabla (G \cdot W(s))\,dx\,ds\\&\qquad + \lambda \int _0^t\!\!\int _D \nabla u_\lambda (s) \cdot \nabla (G \cdot W(s))\,dx\,ds, \end{aligned} \end{aligned}$$
(4.7)

where

by the weak lower semicontinuity of the norm and the weak convergence of \(u_\mu (t)\) to u(t) in \(L^2\). Moreover, recalling that \(\gamma _\mu (\nabla u_\mu ) \rightarrow \eta \) weakly in \(L^1_{t,x}\) and \(\nabla (G\cdot W) \in L^\infty _{t,x}\), as \(V_0 \hookrightarrow W^{1,\infty }\), we have

$$\begin{aligned} \int _0^t\!\!\int _D \gamma _\mu (\nabla u_\mu (s)) \cdot \nabla (G \cdot W(s))\,dx\,ds \longrightarrow \int _0^t\!\!\int _D \eta (s) \cdot \nabla (G \cdot W(s))\,dx\,ds. \end{aligned}$$

The last term on the right-hand side of (4.7) converges to zero as \(\mu \rightarrow 0\) because \((\nabla u_\mu )\) is bounded in \(L^1_{t,x}\) and \(\nabla (G\cdot W) \in L^\infty _{t,x}\). We have thus obtained

$$\begin{aligned} \frac{1}{2}||u(t) - G \cdot W(t) ||^2 + \limsup _{\mu \rightarrow 0} \int _0^t\!\!\int _D \gamma _\mu (\nabla u_\mu ) \cdot \nabla u_\mu \,dx\,ds \le \frac{1}{2}||u_0 ||^2 + \int _0^t\!\!\int _D \eta \cdot \nabla (G \cdot W)\,dx\,ds. \end{aligned}$$

By Lemma 4.8 we have

$$\begin{aligned} \frac{1}{2}||u(t) - G \cdot W(t) ||^2 + \int _0^t\!\!\int _D \eta \cdot \nabla \left( u - G \cdot W \right) \,dx\,ds = \frac{1}{2}||u_0 ||^2, \end{aligned}$$

which implies that

$$\begin{aligned} \limsup _{\mu \rightarrow 0} \int _0^T\!\!\int _D \gamma _{\mu }(\nabla u_{\mu }) \cdot \nabla u_{\mu } \,dx\,ds \le \int _0^T\!\!\int _D \eta \cdot \nabla u \,dx\,ds. \end{aligned}$$

Moreover, since

$$\begin{aligned} \gamma _\mu (x)\cdot (I+\mu \gamma )^{-1}x= \gamma _\mu (x) \cdot x - \mu |\gamma _\mu (x)|^2 \le \gamma _\mu (x)\cdot x \end{aligned}$$

for all \(x \in \mathbb {R}^n\), we obtain

$$\begin{aligned} \limsup _{\mu \rightarrow 0} \int _0^T\!\!\int _D \gamma _{\mu }(\nabla u_\mu ) \cdot (I+\mu \gamma )^{-1} \nabla u_{\mu } \,dx\,ds \le \int _0^T\!\!\int _D \eta \cdot \nabla u\,dx\,ds, \end{aligned}$$

where \((I+\mu \gamma )^{-1}\nabla u_{\mu } \rightarrow \nabla u\) and \(\gamma _{\mu }(\nabla u_{\mu }) \rightarrow \eta \) weakly in \(L^1_{t,x}\). In particular, the weak lower semicontinuity of convex integrals yields

$$\begin{aligned}&\int _0^T\!\!\int _D \left( k(\nabla u)+k^*(\eta ) \right) \\&\quad \le \liminf _{\mu \rightarrow 0} \int _0^T\!\!\int _D \left( k((I+\mu \gamma )^{-1} \nabla u_\mu ) + k^*(\gamma _\mu (\nabla u_\mu )) \right) \,dx\,dt\\&\quad = \liminf _{\mu \rightarrow 0} \int _0^T\!\!\int _D \gamma _\mu (\nabla u_\mu )\cdot (I+\mu \gamma )^{-1} \nabla u_\mu \,dx\,dt < N, \end{aligned}$$

where \(N=N(\omega )\) is a constant. Recalling that \(\gamma _{\mu }(x) \in \gamma ((I+\mu \gamma )^{-1}x)\) for all \(x \in \mathbb {R}^n\) and \(\gamma =\partial k\), we have

$$\begin{aligned} k((I+\mu \gamma )^{-1}\nabla u_{\mu })+ \gamma _{\mu }(\nabla u_{\mu })\cdot (z-(I+\mu \gamma )^{-1}\nabla u_{\mu }) \le k(z) \quad \forall z\in \mathbb {R}^n. \end{aligned}$$

From this it follows, again by the weak lower semicontinuity of convex integrals, that

$$\begin{aligned} \int _0^T\!\!\int _D k(\nabla u) + \int _0^T\!\!\int _D \eta \cdot (\zeta - \nabla u) \le \int _0^T\!\!\int _D k(\zeta ) \quad \forall \zeta \in L^\infty _{t,x}. \end{aligned}$$

Let A be an arbitrary Borel subset of \((0,T) \times D\), \(z_0 \in \mathbb {R}^n\), \(R>0\) a constant, and

$$\begin{aligned} \zeta _R := z_0 1_{A} + T_R(\nabla u) 1_{A^c}, \end{aligned}$$

where \(T_R:\mathbb {R}^n \rightarrow \mathbb {R}^n\) is the truncation operator

$$\begin{aligned} T_R: x \longmapsto {\left\{ \begin{array}{ll} x, &{}\quad |x| \le R,\\ \displaystyle R x/|x|, &{}\quad |x| > R. \end{array}\right. } \end{aligned}$$
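For illustration, a minimal numerical sketch of the truncation operator \(T_R\) (the function name `truncate` is ours, purely illustrative):

```python
import numpy as np

# Minimal sketch of the truncation operator T_R on R^n; the name `truncate`
# is illustrative and not taken from the paper.
def truncate(x, R):
    """T_R(x) = x if |x| <= R, else R*x/|x| (acting along the last axis)."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    scale = np.where(norm > R, R / np.maximum(norm, 1e-300), 1.0)
    return x * scale

rng = np.random.default_rng(1)
x = 5.0 * rng.normal(size=(1000, 3))
y = truncate(x, R=2.0)

# T_R maps everything into the closed ball of radius R and leaves the ball
# fixed, so T_R(v) -> v pointwise as R -> infinity.
assert np.all(np.linalg.norm(y, axis=-1) <= 2.0 + 1e-12)
inside = np.linalg.norm(x, axis=-1) <= 2.0
assert np.allclose(y[inside], x[inside])
```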

Then \(\zeta _R \in L^\infty _{t,x}\), and

$$\begin{aligned}&\int _A k(\nabla u) + \int _A \eta \cdot (z_0 - \nabla u) \le \int _A k(z_0)\\&\quad + \int _{A^c} \left( k(T_R(\nabla u)) - k(\nabla u) \right) + \int _{A^c} \eta \cdot \left( T_R(\nabla u) - \nabla u \right) , \end{aligned}$$

where \(T_R(\nabla u) \rightarrow \nabla u\) and \(k(T_R(\nabla u)) \rightarrow k(\nabla u)\) a.e. in \((0,T) \times D\) as \(R \rightarrow \infty \), as well as

(the latter inequality follows by the assumptions on the behavior of k at infinity). Since \(k(\nabla u)\), \(k^*(\eta ) \in L^1_{t,x}\), the dominated convergence theorem implies that

$$\begin{aligned} \int _A k(\nabla u) + \int _A \eta \cdot (z_0 - \nabla u) \le \int _A k(z_0) \end{aligned}$$

for arbitrary \(z_0\) and A, hence also that

$$\begin{aligned} k(\nabla u) + \eta \cdot (z_0 - \nabla u) \le k(z_0) \end{aligned}$$

a.e. in \((0,T) \times D\) for all \(z_0 \in \mathbb {R}^n\). By definition of subdifferential it follows immediately that \(\eta = \gamma (\nabla u)\) a.e. in \((0,T) \times D\).
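The Yosida-approximation identities used in the argument above can be checked numerically in a simple one-dimensional instance, taking \(\gamma (x)=x^3\) (so that \(k(x)=x^4/4\)); this choice is purely illustrative, the paper's \(\gamma \) being a general subdifferential:

```python
import numpy as np

# Numerical check of the Yosida-approximation identities in the illustrative
# one-dimensional case gamma(x) = x^3 (i.e. k(x) = x^4/4).
def gamma(x):
    return x ** 3

def resolvent(x, mu):
    """J_mu(x) = (I + mu*gamma)^{-1} x: the unique real root of z + mu*z^3 = x,
    via Cardano's formula for the depressed cubic z^3 + z/mu - x/mu = 0."""
    p, q = 1.0 / mu, -x / mu
    disc = np.sqrt(q ** 2 / 4 + p ** 3 / 27)
    return np.cbrt(-q / 2 + disc) + np.cbrt(-q / 2 - disc)

def yosida(x, mu):
    """gamma_mu(x) = (x - J_mu(x)) / mu."""
    return (x - resolvent(x, mu)) / mu

x = np.linspace(-3, 3, 601)
mu = 0.1
J, g = resolvent(x, mu), yosida(x, mu)

assert np.allclose(J + mu * gamma(J), x)        # resolvent equation
assert np.allclose(g, gamma(J))                 # gamma_mu = gamma((I+mu*gamma)^{-1} .)
assert np.allclose(g * J, g * x - mu * g ** 2)  # the algebraic identity used above
assert np.max(np.abs(yosida(x, 1e-5) - gamma(x))) < 0.05  # gamma_mu -> gamma
```

The third assertion is exactly the identity \(\gamma _\mu (x)\cdot (I+\mu \gamma )^{-1}x = \gamma _\mu (x)\cdot x - \mu |\gamma _\mu (x)|^2\), which follows from \((I+\mu \gamma )^{-1}x = x - \mu \gamma _\mu (x)\).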

Let us now show, still keeping \(\omega \) fixed, that the limit u constructed above is unique; since \(\eta = \gamma (\nabla u)\), this implies that \(\eta \) is unique as well. Assume that there exist \(u_1\), \(u_2\) such that

$$\begin{aligned} u_i(t) - \int _0^t {\text {div}}\gamma (\nabla u_i(s))\,ds = u_0 + G \cdot W(t), \quad i=1,2, \end{aligned}$$

in \(L^2\) for all \(t\in [0,T]\). Setting \(v = u_1-u_2\) and \(\zeta = \gamma (\nabla u_1) - \gamma (\nabla u_2)\), it is enough to show that

$$\begin{aligned} v(t) - \int _0^t {\text {div}}\zeta (s)\,ds = 0 \end{aligned}$$

in \(L^2\) for all \(t\in [0,T]\) implies \(v=0\). To this aim, it suffices to note that, by Lemma 4.8,

$$\begin{aligned} \frac{1}{2}||v(t) ||^2 + \int _0^t\!\!\int _D \zeta (s) \cdot \nabla v(s)\,dx\,ds = 0 \end{aligned}$$
for all \(t \in [0,T]\). The monotonicity of \(\gamma \) immediately implies \(v=0\), i.e. \(u_1=u_2\), so that uniqueness of u is proved.

The process u has been constructed for each \(\omega \) in a set of probability one via limiting procedures along sequences that depend on \(\omega \) itself. Of course such a construction, in general, does not produce a measurable process. In our situation, however, uniqueness of u allows us even to prove that u is predictable. The following simple observation is crucial: we have proved that from any subsequence of \(\lambda \) one can extract a further subsequence \(\mu \), depending on \(\omega \), such that \(u_\mu \) converges to u as \(\mu \rightarrow 0\), in several topologies, and that the limit u is unique. This implies, by a classical criterion, that the same convergences hold along the original sequence \(\lambda \), which does not depend on \(\omega \). In particular, \(u_\lambda (\omega ,t) \rightarrow u(\omega ,t)\) weakly in \(L^2\) for all \(t\in [0,T]\) and for \(\mathbb {P}\)-almost every \(\omega \). Let us show that \(u_\lambda \rightarrow u\) weakly in : for an arbitrary , we have

a.e. in \(\Omega \times [0,T]\), as well as

for a constant N independent of \(\lambda \), because \((u_\lambda )\) is bounded in by Lemma 4.3. Then \(\langle u_\lambda ,\phi \rangle \) is uniformly integrable in \(\Omega \times [0,T]\) by the criterion of de la Vallée Poussin, hence \(\langle u_\lambda ,\phi \rangle \rightarrow \langle u,\phi \rangle \) in by Vitali’s theorem. Since is arbitrary, it follows that \(u_\lambda \rightarrow u\) weakly in . Mazur’s lemma (see, e.g., [4, p. 360]) implies that there exists a sequence \((\zeta _n)\) of convex combinations of \(u_\lambda \) such that \(\zeta _n(\omega ,t) \rightarrow u(\omega ,t)\) in \(L^2\) in \(\mathbb {P}\otimes dt\)-measure, hence a.e. in \(\Omega \times [0,T]\) along a subsequence. Since \((u_\lambda )\) is a collection of \(L^2\)-valued predictable processes, the same holds for \((\zeta _n)\), so that the \(\mathbb {P}\otimes dt\)-a.e. pointwise limit u of (a subsequence of) \(\zeta _n\) is an \(L^2\)-valued predictable process as well. We also have that , as it follows by \(u_\lambda \rightarrow u\) in and the boundedness of \((u_\lambda )\) in .

Moreover, recalling that \(\nabla u_\lambda \rightarrow \nabla u\) and \(\gamma _\lambda (\nabla u_\lambda ) \rightarrow \eta \) weakly in \(L^1_{t,x}\) \(\mathbb {P}\)-a.s., and that, by Lemma 4.4, \((\nabla u_\lambda )\) and \((\gamma _\lambda (\nabla u_\lambda ))\) are bounded in , an entirely analogous argument shows that \(\nabla u_\lambda \rightarrow \nabla u\) and \(\gamma _\lambda (\nabla u_\lambda ) \rightarrow \eta =\gamma (\nabla u)\) weakly in . This implies that \(\eta \) is a measurable adapted process, as well as, by weak lower semicontinuity of the norm,

We can hence conclude that

Finally, Lemma 4.3 and (4.3) yield

where, by the weak lower semicontinuity of convex integrals and \((I+\lambda \gamma )^{-1}\nabla u_\lambda \rightarrow \nabla u\), \(\gamma _\lambda (\nabla u_\lambda ) \rightarrow \eta \) weakly in \(L^1_{t,x}\) \(\mathbb {P}\)-a.s., one has

$$\begin{aligned} \int _0^T\!\!\int _D \left( k(\nabla u) + k^*(\eta ) \right) \le \liminf _{\lambda \rightarrow 0} \int _0^T\!\!\int _D \left( k((I+\lambda \gamma )^{-1}\nabla u_\lambda ) + k^*(\gamma _\lambda (\nabla u_\lambda ))\right) \end{aligned}$$

\(\mathbb {P}\)-a.s., hence, by Fatou’s lemma,

(4.8)

Remark 4.9

The proof of uniqueness of u does not depend on \(\gamma \) being single-valued. In particular, all results on u obtained thus far, including the predictability of u, can be obtained under the more general assumption that \(\gamma \) is an everywhere defined maximal monotone graph on \(\mathbb {R}^n \times \mathbb {R}^n\), with \(\gamma =\partial k\). However, in this more general framework, the uniqueness of \(\eta \) does not follow, because the divergence is not injective. This implies that we would not be able even to prove that \(\eta \) is a measurable process (with respect to the product \(\sigma \)-algebra of \(\mathscr {F}\) and the Borel \(\sigma \)-algebra of [0, T]).

5 Well-posedness with additive noise

We are now going to prove well-posedness for the equation

$$\begin{aligned} du(t) - {\text {div}}\gamma (\nabla u(t))\,dt = G(t)\,dW(t), \quad u(0)=u_0, \end{aligned}$$
(5.1)

where G is no longer supposed to take values in \(\mathscr {L}^2(H,V_0)\), as in the previous section, but just in \(\mathscr {L}^2(H,L^2)\). In other words, we are considering Eq. (1.1) with additive noise.

Proposition 5.1

Assume that \(u_0 \in L^2(\Omega ;L^2)\) is \(\mathscr {F}_0\)-measurable and that \(G: \Omega \times [0,T] \rightarrow \mathscr {L}^2(H,L^2)\) is measurable and adapted. Then Eq. (5.1) is well posed in \(\mathscr {K}\). Moreover, its solution is pathwise weakly continuous with values in \(L^2\).

Proof

Since one has \((I-\varepsilon \Delta )^{-m}: L^2 \rightarrow H^{2m} \cap H^1_0\) for any \(m \in \mathbb {N}\), choosing \(m > 1/2 + n/4\), the Sobolev embedding theorem yields \(H^{2m} \hookrightarrow W^{1,\infty }\), hence \(V_0 := H^{2m} \cap H^1_0\) satisfies all hypotheses stated at the beginning of the previous section. In particular, setting

$$\begin{aligned} G^\varepsilon := (I - \varepsilon \Delta )^{-m} G, \end{aligned}$$

the ideal property of Hilbert–Schmidt operators implies that \(G^\varepsilon \) is an \(\mathscr {L}^2(H,V_0)\)-valued measurable and adapted process such that

It follows by Proposition 4.1 that, for any \(\varepsilon >0\), there exists a unique predictable process

such that

satisfying

$$\begin{aligned} u^\varepsilon (t) - \int _0^t {\text {div}}\eta ^\varepsilon (s)\,ds = u_0 + \int _0^t G^\varepsilon (s)\,dW(s) \end{aligned}$$
(5.2)

in \(L^2\) for all \(t \in [0,T]\).
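The ideal property of Hilbert–Schmidt operators invoked here, namely \(\Vert AB\Vert _{\mathscr {L}^2} \le \Vert A\Vert \,\Vert B\Vert _{\mathscr {L}^2}\) for A bounded and B Hilbert–Schmidt, can be sanity-checked in finite dimensions, where the Hilbert–Schmidt norm reduces to the Frobenius norm (a toy check, not part of the proof):

```python
import numpy as np

# Finite-dimensional sanity check of the ideal property of Hilbert-Schmidt
# operators, ||A B||_HS <= ||A||_op * ||B||_HS, as used for
# G^eps = (I - eps*Delta)^{-m} G: matrices stand in for operators.
rng = np.random.default_rng(2)
hs = lambda M: np.linalg.norm(M, "fro")   # Hilbert-Schmidt (Frobenius) norm

for _ in range(100):
    A = rng.normal(size=(8, 8))
    B = rng.normal(size=(8, 8))
    op_norm = np.linalg.norm(A, 2)        # operator norm = largest singular value
    assert hs(A @ B) <= op_norm * hs(B) + 1e-10
```

Since \((I-\varepsilon \Delta )^{-m}\) is a contraction, this inequality gives \(\Vert G^\varepsilon \Vert _{\mathscr {L}^2(H,L^2)} \le \Vert G\Vert _{\mathscr {L}^2(H,L^2)}\), uniformly in \(\varepsilon \).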

In complete analogy to the previous section, the equation in \(H^{-1}\)

$$\begin{aligned} u^\varepsilon _\lambda (t) - \int _0^t {\text {div}}\gamma _\lambda (\nabla u^\varepsilon _\lambda (s))\,ds - \lambda \int _0^t \Delta u^\varepsilon _\lambda (s)\,ds = u_0 + \int _0^t G^\varepsilon (s)\,dW(s) \end{aligned}$$

admits a unique (variational) strong solution \(u_\lambda ^\varepsilon \) for any \(\varepsilon >0\) and \(\lambda >0\). Taking into account the monotonicity of \(\gamma _\lambda \), Itô’s formula yields, for any \(\delta >0\),

Taking supremum in time and expectation, it easily follows from Lemma 4.2 that

For arbitrary fixed \(\varepsilon \), \(\delta >0\), the proof of Proposition 4.1 shows that

$$\begin{aligned} u^\varepsilon _\lambda&\longrightarrow u^\varepsilon&\text {weakly* in } L^\infty _tL^2_x,\\ \nabla u^\varepsilon _\lambda&\longrightarrow \nabla u^\varepsilon&\text {weakly in } L^1_{t,x},\\ \gamma _\lambda (\nabla u^\varepsilon _\lambda )&\longrightarrow \eta ^\varepsilon&\text {weakly in } L^1_{t,x} \end{aligned}$$

\(\mathbb {P}\)-a.s. as \(\lambda \rightarrow 0\), and the same holds replacing \(\varepsilon \) with \(\delta \). In particular, on a set of probability one, \(u^\varepsilon _\lambda -u^\delta _\lambda \rightarrow u^\varepsilon -u^\delta \) weakly* in \(L^\infty _tL^2_x\) as \(\lambda \rightarrow 0\), hence the weak* lower semicontinuity of the norm and Fatou’s lemma imply

It follows by the ideal property of Hilbert–Schmidt operators, the contractivity of \((I-\varepsilon \Delta )^{-m}\), and the dominated convergence theorem, that

as \(\varepsilon \rightarrow 0\). This implies that \((u^\varepsilon )\) is a Cauchy sequence in , hence there exists a predictable \(L^2\)-valued process u such that \(u^\varepsilon \) converges (strongly) to u in as \(\varepsilon \rightarrow 0\). Moreover, by (4.8) there exists a constant N, independent of \(\varepsilon \), such that

(5.3)

where we have used again the ideal property of Hilbert–Schmidt operators and the contractivity of \((I-\varepsilon \Delta )^{-m}\) in the last step. The sequences \((\nabla u^\varepsilon )\) and \((\gamma (\nabla u^\varepsilon ))\) are hence uniformly integrable on \(\Omega \times [0,T] \times D\) by the criterion of de la Vallée Poussin, hence relatively weakly compact in by the Dunford-Pettis theorem. Therefore, passing to a subsequence of \(\varepsilon \), denoted by the same symbol, there exist v and \(\eta \) such that \(\nabla u^\varepsilon \rightarrow v\) and \(\gamma (\nabla u^\varepsilon ) \rightarrow \eta \) weakly in as \(\varepsilon \rightarrow 0\). It is then straightforward to check that \(v=\nabla u\) and

An argument based on Mazur’s lemma, entirely analogous to the one used in the proof of Proposition 4.1, shows that \(\eta \) is an \(L^1\)-valued adapted process.

We can now pass to the limit as \(\varepsilon \rightarrow 0\) in (5.2). The strong convergence of \(u^\varepsilon \) to u in implies that

in probability as \(\varepsilon \rightarrow 0\). Let \(\phi _0 \in V_0\) be arbitrary. Since \(V_0 \hookrightarrow L^\infty \), one has

in probability for almost all \(t \in [0,T]\). Let us set, for an arbitrary but fixed \(t \in [0,T]\), \(\phi :s \mapsto 1_{[0,t]}(s) \phi _0 \in L^\infty _t V_0\). Recalling that \(\eta ^\varepsilon =\gamma (\nabla u^\varepsilon ) \rightarrow \eta \) weakly in , it follows immediately that

$$\begin{aligned} - \int _0^t \langle {\text {div}}\eta ^\varepsilon ,\phi _0 \rangle \,ds&= \int _0^T\!\!\int _D\eta ^\varepsilon (s)\cdot \phi (s)\,ds \\&\quad \rightarrow \int _0^T\!\!\int _D\eta (s)\cdot \nabla \phi (s)\,ds = - \int _0^t \langle {\text {div}}\eta (s),\phi _0 \rangle \,ds \end{aligned}$$

weakly in as \(\varepsilon \rightarrow 0\). Doob’s maximal inequality and the convergence

as \(\varepsilon \rightarrow 0\) readily yield also that \(G^\varepsilon \cdot W(t) \rightarrow G \cdot W(t)\) in \(L^2\) in probability for all \(t \in [0,T]\). In particular, since \(\phi _0\in V_0\) and \(t \in [0,T]\) are arbitrary, we infer that

$$\begin{aligned} u(t) - \int _0^t {\text {div}}\eta (s)\,ds = u_0 + \int _0^t G(s)\,dW(s) \end{aligned}$$

holds in \(V_0'\) for almost all t. Recalling that \(\eta \in L^1_{t,x}\), which implies in turn that \({\text {div}}\eta \in L^1_t V_0'\), it follows that all terms except the first on the left-hand side have trajectories in \(C_t V_0'\), hence that the identity holds for all \(t \in [0,T]\). Moreover, thanks to Strauss’ weak continuity criterion, \(u \in C_t V_0'\) and \(u \in L^\infty _t L^2_x\) imply \(u \in C_w([0,T];L^2)\). Note also that all terms except the second one on the left-hand side are \(L^2\)-valued, hence the identity holds also in \(L^2\) for all \(t \in [0,T]\).

The weak convergences \(\nabla u^\varepsilon \rightarrow \nabla u\) and \(\eta ^\varepsilon \rightarrow \eta \) in and the weak lower semicontinuity of convex integrals yield, taking (5.3) into account,

To complete the proof of existence, we only need to show that \(\eta = \gamma (\nabla u)\) a.e. in \(\Omega \times (0,T) \times D\). Note that, by Proposition 3.1, we have

and

where, as \(\varepsilon \rightarrow 0\), \(||u^\varepsilon (T) || \rightarrow ||u(T) ||\) in , thanks to the strong convergence of \(u^\varepsilon \) to u in ;

in by an (already seen) argument involving the ideal property of Hilbert–Schmidt operators;

$$\begin{aligned} \int _0^T u^\varepsilon (s)G^\varepsilon (s)\,dW(s) \longrightarrow \int _0^Tu(s)G(s)\,dW(s) \end{aligned}$$

in as it follows by Lemma 4.2. In particular, we infer

hence also, by Fatou’s lemma,

$$\begin{aligned} \limsup _{\varepsilon \rightarrow 0} \mathop {{}\mathbb {E}}\int _0^T\!\!\int _D \gamma (\nabla u^\varepsilon ) \cdot \nabla u^\varepsilon \le \mathop {{}\mathbb {E}}\int _0^T\!\!\int _D \eta \cdot \nabla u. \end{aligned}$$

Since \(\nabla u^\varepsilon \rightarrow \nabla u\) and \(\gamma (\nabla u^\varepsilon ) \rightarrow \eta \) weakly in , recalling that \(\gamma \) is maximal monotone, it follows that \(\eta \in \gamma (\nabla u)\) a.e. in \(\Omega \times (0,T)\times D\) (see, e.g., [2, Lemma 2.3, p. 38]).

Let \(u_{01}\), \(u_{02}\) be \(\mathscr {F}_0\)-measurable, and \(G_1\), \(G_2:\Omega \times [0,T] \rightarrow \mathscr {L}^2(H,L^2)\) be measurable adapted processes such that

If \(u_i \in \mathscr {K}\), \(i=1,2\), are solutions to

$$\begin{aligned} du_i - {\text {div}}\gamma (\nabla u_i)\,dt = G_i\,dW, \quad u_i(0)=u_{0i}, \end{aligned}$$

we are going to show that

(5.4)

from which uniqueness and Lipschitz-continuous dependence on the initial datum follow immediately. We shall actually obtain this estimate as a special case of a more general one that will be useful in the next section: setting

$$\begin{aligned} y(t) := u_1(t) - u_2(t), \quad y_0 := u_{01} - u_{02}, \quad F(t) := G_1(t) - G_2(t), \end{aligned}$$

one has

$$\begin{aligned} y(t) - \int _0^t {\text {div}}\zeta (s)\,ds = y_0 + \int _0^t F(s)\,dW(s), \end{aligned}$$

where \(\zeta = \gamma (\nabla u_1) - \gamma (\nabla u_2)\). Setting, for any \(\alpha \ge 0\),

$$\begin{aligned} y^\alpha (t) := e^{-\alpha t} y(t), \quad \zeta ^\alpha (t) := e^{-\alpha t} \zeta (t), \quad F^\alpha (t) := e^{-\alpha t} F(t), \end{aligned}$$

the integration by parts formula yields

$$\begin{aligned} y^{\alpha }(t) + \int _0^t \left( \alpha y^{\alpha }(s) - {\text {div}}\zeta ^{\alpha }(s) \right) \,ds = y_0 + \int _0^t F^{\alpha }(s)\,dW(s), \end{aligned}$$

from which, by Proposition 3.1, we deduce

where, by monotonicity of \(\gamma \), \(\zeta ^\alpha \cdot \nabla y^\alpha = e^{-2\alpha \cdot } \left( \gamma (\nabla u_1) - \gamma (\nabla u_2)\right) \cdot (\nabla u_1-\nabla u_2) \ge 0\). Therefore, taking the supremum in t and expectation on both sides, one has

(5.5)

where the second inequality follows by an application of Lemma 4.2. Estimate (5.4) is just the special case \(\alpha =0\). \(\square \)

6 Proof of the main result

Thanks to the results established thus far, we are now in a position to prove Theorem 2.2. Let \(v:\Omega \times [0,T] \rightarrow L^2\) be a measurable adapted process such that

and consider the equation

$$\begin{aligned} du(t) -{\text {div}}\gamma (\nabla u(t))\,dt = B(t,v(t))\,dW(t), \quad u(0)=u_0, \end{aligned}$$

where \(u_0\) is an \(\mathscr {F}_0\)-measurable \(L^2\)-valued random variable with finite second moment. The assumptions on B imply that \(B(\cdot ,v)\) is measurable, adapted, and such that

hence the above equation is well-posed in \(\mathscr {K}\) by Proposition 5.1, which allows one to define a map \(\Gamma :(u_0,v) \mapsto u\). Let \(u_i=\Gamma (u_{0i},v_i)\), \(i=1,2\), where \(u_{0i}\) and \(v_i\) satisfy the same measurability and integrability assumptions as \(u_0\) and v, respectively. For any \(\alpha \ge 0\), (5.5) and the Lipschitz continuity of B yield

Choosing \(\alpha \) large enough, it follows that, for any \(u_0\) as above, the map \(v \mapsto \Gamma (u_0,v)\) is a strict contraction on the Banach space \(E_\alpha \) of measurable adapted processes v with finite norm

$$\begin{aligned} ||v ||_{E_\alpha } := \left( \mathop {{}\mathbb {E}}\int _0^T e^{-2\alpha s} ||v(s) ||^2\,ds \right) ^{1/2}. \end{aligned}$$

By the Banach fixed point theorem, the map \(v \mapsto \Gamma (u_0,v)\) admits a unique fixed point u in \(E_\alpha \). Since all \(E_\alpha \)-norms are equivalent for different values of \(\alpha \), u belongs to \(E_0\) and, by definition of \(\Gamma \), u also belongs to \(\mathscr {K}\) and solves (1.1). Taking into account that any solution to (1.1) is necessarily a fixed point of \(v \mapsto \Gamma (u_0,v)\), it immediately follows that u is the unique solution to (1.1) in \(\mathscr {K}\). Lipschitz continuity of the solution map follows from the above estimate, which manifestly implies

and

thus completing the proof.
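The role of the exponential weight \(e^{-\alpha t}\) in the fixed-point argument can be illustrated on a deterministic toy Volterra map (our own example, unrelated to the paper's \(\Gamma \)): for \((\Phi v)(t) = L\int _0^t v(s)\,ds\) one has \(\Vert \Phi v_1 - \Phi v_2\Vert _\alpha \le (L/\alpha )\Vert v_1 - v_2\Vert _\alpha \) in the weighted norm \(\Vert v\Vert _\alpha = \sup _t e^{-\alpha t}|v(t)|\), so any \(\alpha > L\) yields a contraction even when \(LT > 1\) defeats the unweighted sup norm:

```python
import numpy as np

# Toy check that an exponential weight makes a Volterra-type map contractive:
# for (Phi v)(t) = L * int_0^t v(s) ds one has
#   ||Phi v1 - Phi v2||_alpha <= (L/alpha) * ||v1 - v2||_alpha,
# where ||v||_alpha = sup_t e^{-alpha*t} |v(t)|.
L, alpha, T, n = 1.0, 2.0, 1.0, 100_000
t = np.linspace(0.0, T, n + 1)
dt = T / n

rng = np.random.default_rng(3)
d = rng.normal(size=n + 1)  # a rough random difference v1 - v2 on the grid

# trapezoidal cumulative integral of d, i.e. (Phi v1 - Phi v2)/L at the nodes
Phi_d = L * np.concatenate(([0.0], dt * np.cumsum((d[1:] + d[:-1]) / 2)))

weighted = lambda v: np.max(np.exp(-alpha * t) * np.abs(v))
assert weighted(Phi_d) <= (L / alpha) * weighted(d) + 1e-12
```

This is, in a simplified deterministic setting, the same mechanism that makes \(v \mapsto \Gamma (u_0,v)\) contractive in \(E_\alpha \) for \(\alpha \) large in (5.5).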