Ergodicity and Kolmogorov equations for dissipative SPDEs with singular drift: a variational approach

We prove existence of invariant measures for the Markovian semigroup generated by the solution to a parabolic semilinear stochastic PDE whose nonlinear drift term satisfies only a kind of symmetry condition on its behavior at infinity, but no restriction on its growth rate is imposed. Thanks to strong integrability properties of invariant measures $\mu$, solvability of the associated Kolmogorov equation in $L^1(\mu)$ is then established, and the infinitesimal generator of the transition semigroup is identified as the closure of the Kolmogorov operator. A key role is played by a generalized variational setting.


Introduction
Our goal is to study the asymptotic behavior of solutions to semilinear stochastic partial differential equations on a smooth bounded domain $D \subseteq \mathbb{R}^n$ of the form $$dX_t + AX_t\,dt + \beta(X_t)\,dt \ni B(X_t)\,dW_t, \qquad X(0) = X_0. \tag{1.1}$$ Here $A : V \to V'$ is a linear maximal monotone operator from a Hilbert space $V$ to its dual $V'$, and $V \subset H := L^2(D) \subset V'$ is a so-called Gelfand triple; $\beta$ is a maximal monotone graph everywhere defined on $\mathbb{R}$; $W$ is a cylindrical Wiener process on a separable Hilbert space $U$, and $B$ takes values in the space of Hilbert-Schmidt operators from $U$ to $L^2(D)$. Precise assumptions on the data of the problem are given in §2 below. The most salient point is that $\beta$ is not assumed to satisfy any growth condition, but only a kind of symmetry on its rate of growth at plus and minus infinity; see assumption (vi) in §2 below. Well-posedness of equation (1.1) in the strong (variational) sense has recently been obtained in [14] by a combination of classical results by Pardoux and Krylov-Rozovskiĭ (see [11,16]) with pathwise estimates and weak compactness arguments. The minimal assumptions on the drift term $\beta$ imply that, in general, the operator $A + \beta$ does not satisfy the coercivity and boundedness assumptions required by the variational approach of [11,16]. For this reason, questions such as ergodicity and existence of invariant measures for (1.1) cannot be addressed using the results by Barbu and Da Prato in [3], which appear to be the only ones available for equations in the variational setting (cf. also [15]). On the other hand, there is a vast literature on these problems for equations cast in the mild setting; references can be found, for instance, in [6,7,17]. Even in this case, however, we are not aware of results on equations with a drift term as general as in (1.1).
Our results thus considerably extend, or at least complement, those on reaction-diffusion equations in [5,6,7], for instance, where polynomial growth assumptions are essential. More recent existence and integrability results for invariant measures of semilinear equations have been obtained, e.g., in [9,10], but still under local Lipschitz-continuity or other suitable growth assumptions on the drift. Another possible advantage of our results is that we use only standard monotonicity assumptions, whereas in a large part of the cited literature one encounters assumptions of the type $\langle Ax + \beta(x + y), z \rangle \le f(\|y\|) - k\|x\|$ for some (or all) $z$ belonging to the subdifferential of $\|x\|$, where $f$ is a function and $k$ a constant.
Here $A$ actually stands for the part of $A$ in a Banach space $E$ continuously embedded in $L^2(D)$, $\langle\cdot,\cdot\rangle$ stands for the duality between $E$ and its dual, and the condition is assumed to hold for those $x, y$ for which all terms are well defined. Often $E$ is chosen as a space of continuous functions such as C(D). This monotonicity-type condition on A and β is precisely what one needs in order to obtain a priori estimates by reducing the original equation to a deterministic one with random coefficients, under the assumption of additive noise. Using a figurative but rather accurate expression, this method amounts to "subtracting the stochastic convolution". Our estimates are obtained mostly by stochastic calculus, for which the standard notion of monotonicity suffices. Among such estimates we obtain the integrability of (the potential of) the nonlinear drift term β with respect to the invariant measure µ, which is known to be a delicate issue, especially for non-gradient systems (cf. the discussion in [9]). These results allow us to show that the Kolmogorov operator associated to the stochastic equation (1.1) with additive noise is essentially m-dissipative in L 1 (H, µ). This implies that the closure of the Kolmogorov operator in L 1 (H, µ) generates a Markovian semigroup of contractions, which is a µ-version of the transition semigroup generated by the solution to the stochastic equation. It is worth mentioning that the variational-type setting, while allowing for a very general drift term β, gives rise to a number of technical issues in the study of Kolmogorov equations, for instance because test functions in function spaces on V and V ′ naturally appear. We conclude this introductory section with a brief description of the structure of the paper and of the main results. In Section 2 we state the basic assumptions which are in force throughout the paper, and recall the well-posedness result for equation (1.1) obtained in [14].
For the reader's convenience we collect in Section 3 some tools needed in the sequel, such as Prokhorov's theorem on compactness of sets of probability measures, and the Krylov-Bogoliubov criterion for the existence of an invariant measure for a Markovian transition semigroup. Section 4 is devoted to auxiliary results, most of which should be interesting in their own right, that underpin our subsequent arguments. In particular, we prove two generalized versions of the classical Itô formula in the variational setting for equation (1.1): one for the square of the norm, and another one extending a very useful but not-so-well-known version for more general smooth functions, originally obtained by Pardoux (see [16, p. 62-ff]). Furthermore, we establish results on the first and second-order differentiability, both in the Gâteaux and Fréchet sense, of (variational) solutions to semilinear equations with regular drift with respect to the initial datum. In Section 5 we prove that the transition semigroup P generated by the solution to (1.1) admits an ergodic invariant measure µ, which is also shown to be unique and strongly mixing if β is superlinear. These results follow mainly from a priori estimates (which, in turn, are obtained by stochastic calculus) and compactness. Finally, Section 6 deals with the Kolmogorov equation associated to (1.1). In particular, we characterize the infinitesimal generator −L of the transition semigroup P on L 1 (H, µ) as the closure of the Kolmogorov operator −L 0 . After showing that L 0 is dissipative and coincides with L on a suitably chosen dense subset of L 1 (H, µ), we prove that the image of I + L 0 is dense in L 1 (H, µ), so that the Lumer-Phillips theorem can be applied.
Due to the variational formulation of the problem, the latter point turns out to be rather delicate, even though the general approach follows a typical scheme: we first introduce appropriate regularizations of L 0 , for which the Kolmogorov equation can be solved by established techniques, and then we pass to the limit with respect to the regularization parameters. Here the generalized Itô formulas and the differentiability results proved in Section 4 play a key role.

General assumptions and well-posedness
Before stating the hypotheses on the coefficients and on the initial datum of equation (1.1) that will be in force throughout the paper, let us fix some notation.

Notation
Given two real Banach spaces E and F , the space of bounded linear operators from E to F will be denoted by L (E, F ). When F = R, we shall just write E ′ . If E and F are Hilbert spaces, L 2 (E, F ) stands for the ideal of Hilbert-Schmidt operators in L (E, F ). The Hilbert space L 2 (D) will be denoted by H, and its norm and scalar product by · and ·, · , respectively. For any topological space E, the Borel σ-algebra on E will be denoted by B(E). All measures on E are intended to be defined on its Borel σ-algebra, unless otherwise stated. The spaces of bounded Borel-measurable and bounded continuous functions on E will be denoted by B b (E) and C b (E), respectively.

Assumptions
Let V be a separable Hilbert space densely, continuously and compactly embedded in H = L 2 (D). The duality form between V and V ′ is also denoted by ·, · , as customary. We assume that A ∈ L (V, V ′ ) satisfies the following properties: (i) A is coercive, i.e. there exists a constant C > 0 such that $\langle Av, v\rangle \ge C\|v\|_V^2$ for every $v \in V$; (ii) the part of A in H can be uniquely extended to an m-accretive operator A 1 on L 1 (D); (iii) for every δ > 0, the resolvent (I + δA 1 ) −1 is sub-Markovian, i.e. for every f ∈ L 1 (D) such that 0 ≤ f ≤ 1 a.e. on D, we have 0 ≤ (I + δA 1 ) −1 f ≤ 1 a.e. on D; (iv) there exists m ∈ N such that (I + δA 1 ) −m ∈ L (L 1 (D), L ∞ (D)).
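A standard model case fitting the assumptions above, consistent with the choice V = H 1 0 made later in Section 4, is the Dirichlet Laplacian. The following display is our illustration, not stated explicitly in the text:

```latex
% Model Gelfand triple and operator (an illustrative sketch):
V = H^1_0(D) \hookrightarrow H = L^2(D) \hookrightarrow V' = H^{-1}(D),
\qquad
\langle Av, w \rangle := \int_D \nabla v \cdot \nabla w \, dx,
\quad v, w \in V.
% Here A = -\Delta with Dirichlet boundary conditions: its part in H
% extends to an m-accretive operator A_1 on L^1(D), the resolvents
% (I + \delta A_1)^{-1} are sub-Markovian by the maximum principle, and
% (I + \delta A_1)^{-m} maps L^1(D) into L^\infty(D) for m large enough,
% by ultracontractivity of the Dirichlet heat semigroup on a bounded domain.
```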
Let us now consider the non-linear term in the drift. We assume that (v) β ⊂ R × R is a maximal monotone graph such that 0 ∈ β(0) and D(β) = R.
Let j : R → R + be the unique convex lower semicontinuous function such that j(0) = 0 and β = ∂j, in the sense of convex analysis. We assume that (vi) $\limsup_{|r| \to \infty} j(r)/j(-r) < +\infty$. This hypothesis is obviously satisfied if j (or, equivalently, β) is symmetric. Denoting the convex conjugate of j by j * , it is well known that the hypothesis D(β) = R is equivalent to the superlinearity of j * at infinity, i.e. $\lim_{|s| \to \infty} j^*(s)/|s| = +\infty$.
We are going to need the following property implied by assumption (vi): there exists a strictly positive number η such that, for every measurable function y : D → R, j * (y) ∈ L 1 (D) implies j * (η|y|) ∈ L 1 (D). In fact, from (vi) we deduce that there exist R > 0 and M 1 = M 1 (R) > 0 such that j(r) ≤ M 1 j(−r) for |r| ≥ R. Since j ≥ 0, one can choose M 1 > 1 without loss of generality. Setting M 2 := max{j(r) : |r| ≤ R}, which is finite by continuity of j, we deduce that $j(r) \le M_1 j(-r) + M_2$ for every $r \in \mathbb{R}$. Taking convex conjugates on both sides we infer that $M_1 j^*(-s/M_1) \le j^*(s) + M_2$ for every $s \in \mathbb{R}$. Setting η := 1/M 1 < 1 and recalling that j * (0) = 0, hence j * is positive on R and increasing on R + , one has, using also the convexity inequality $j^*(\eta s) \le \eta\, j^*(s)$, that $j^*(\eta|y|) \le j^*(y) + M_2$ pointwise on D, which proves the claim. The assumptions on the Wiener process W and the diffusion coefficient B are standard: let U be a separable Hilbert space and W a cylindrical Wiener process on U , defined on a filtered probability space (Ω, F , (F t ) t∈[0,T ] , P) satisfying the so-called usual conditions. 1 We assume that (vii) B : H → L 2 (U, H) is Lipschitz-continuous and with linear growth, i.e. there exists a positive constant L B such that $\|B(x) - B(y)\|_{L_2(U,H)} \le L_B \|x - y\|$ and $\|B(x)\|_{L_2(U,H)} \le L_B(1 + \|x\|)$ for every $x, y \in H$. Finally, the initial datum X 0 is assumed to be F 0 -measurable and such that $\mathbb{E}\|X_0\|^2$ is finite. All hypotheses just stated will be tacitly assumed to hold throughout.
The following well-posedness result for equation (1.1) has been proved in [14], where the diffusion coefficient B is allowed to be random and time-dependent as well. Theorem 2.1. There is a unique pair (X, ξ), with X a V -valued adapted process and ξ an L 1 (D)-valued predictable process, such that $\xi \in \beta(X)$ a.e. in $\Omega \times (0,T) \times D$ and $$X(t) + \int_0^t AX(s)\,ds + \int_0^t \xi(s)\,ds = X_0 + \int_0^t B(X(s))\,dW(s) \quad \text{in } V' \quad \forall t \in [0,T].$$ Moreover, X is P-a.s. pathwise weakly continuous from [0, T ] to H, and the solution map $X_0 \mapsto X$ depends continuously on the initial datum. Remark 2.2. In a forthcoming work we shall prove that the solution X is actually pathwise continuous, not just weakly continuous, and that well-posedness continues to hold under local Lipschitz-continuity and linear growth assumptions on B. We shall also show how the regularity of the solution X depends on the regularity of the diffusion coefficient B.
1 Expressions involving random elements are always meant to hold P-a.s. unless otherwise stated.

Compactness in spaces of probability measures
The set of probability measures on E is denoted by M 1 (E) and endowed with the topology σ(M 1 (E), C b (E)), which we shall call the narrow topology. We recall that a subset N of M 1 (E) is called (uniformly) tight if for every ε > 0 there exists a compact set K ε such that µ(E \ K ε ) < ε for all µ ∈ N . The following characterization of relative compactness of sets of probability measures is classical (see, e.g., [4, §5.5]). Theorem 3.1 (Prokhorov). Let E be a complete separable metric space. Then a subset N of M 1 (E) is relatively compact in the narrow topology if and only if it is tight.

Markovian semigroups and ergodicity
A family P = (P t ) t≥0 of Markovian kernels on a measure space (E, E ) such that P t+s = P t P s for all t, s ≥ 0 is called a Markovian semigroup. We recall that a Markovian kernel on (E, E ) is a map K : E × E → [0, 1] such that (i) x → K(x, A) is E -measurable for each A ∈ E , (ii) A → K(x, A) is a measure on E for each x ∈ E, and (iii) K(x, E) = 1 for each x ∈ E. A Markovian kernel K on (E, E ) can naturally be extended to the space bE of E -measurable bounded functions by the prescription $$Kf(x) := \int_E f(y)\,K(x, dy), \qquad f \in b\mathcal{E}.$$ Then K : bE → bE is a linear, bounded, positive, σ-order continuous map. Similarly, K can be extended to positive measures on E setting $$\mu K(A) := \int_E K(x, A)\,\mu(dx), \qquad A \in \mathcal{E}.$$ The notations P t f and µP t , with f an E -measurable bounded or positive function and µ a positive measure on E , are hence to be understood in this sense. We shall also assume that P 0 = I and that (t, x) → P t f (x) is measurable for every f ∈ bE . A probability measure µ on E is said to be an invariant measure for the Markovian semigroup P if $$\int_E P_t f\,d\mu = \int_E f\,d\mu \qquad \forall f \in b\mathcal{E}, \ \forall t \ge 0,$$ or, equivalently, if µP t = µ for all t ≥ 0. If P admits an invariant measure µ, then it can be extended to a Markovian semigroup on L p (E, µ), for every p ≥ 1. The invariant measure µ is said to be ergodic for P if $$\lim_{T\to\infty} \frac{1}{T}\int_0^T P_t f\,dt = \int_E f\,d\mu \quad \text{in } L^2(E,\mu) \qquad \forall f \in L^2(E,\mu),$$ and strongly mixing if $$\lim_{t\to\infty} P_t f = \int_E f\,d\mu \quad \text{in } L^2(E,\mu) \qquad \forall f \in L^2(E,\mu).$$ We recall the following classical fact on the structure of the set of ergodic measures: the ergodic invariant measures for P are exactly the extreme points of the set of its invariant measures. In particular, if P admits a unique invariant measure µ, then µ is ergodic. In order to state a criterion for the existence of invariant measures, let us introduce, for any probability measure ν ∈ M 1 (E), the family of averaged measures (µ ν t ) t>0 defined as $$\mu^\nu_t := \frac{1}{t}\int_0^t \nu P_s\,ds.$$ Theorem 3.2 (Krylov and Bogoliubov). Let (P t ) t≥0 be a (time-homogeneous) Markovian transition semigroup on a complete separable metric space E. Assume that (a) (P t ) t≥0 has the Feller property, i.e. that it maps C b (E) into itself; (b) there exists ν ∈ M 1 (E) such that the family (µ ν t ) t>0 is tight. Then the set of invariant measures for (P t ) t≥0 is non-empty.
Note that if x ∈ E and ν = δ x is the Dirac measure at x, then νP s = P s (x, ·). Hence condition (b) is satisfied if there exists x ∈ E such that the family of measures $(\mu^{\delta_x}_t)_{t>0}$ is tight. It is easily seen that this latter condition is in turn satisfied if (P t (x, ·)) t≥0 ⊂ M 1 (E) is tight.
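The mechanism behind the Krylov-Bogoliubov criterion can be summarized by a short computation; the following sketch (our reconstruction, using the Feller property) shows why narrow limit points of the averaged measures are invariant:

```latex
% If \mu^\nu_{t_k} \to \mu narrowly with t_k \to \infty, then for r \ge 0
% and f \in C_b(E), using that P_r f \in C_b(E) (Feller property):
\int_E P_r f \, d\mu
  = \lim_{k\to\infty} \frac{1}{t_k}\int_0^{t_k}\!\!\int_E P_{s+r} f \, d\nu \, ds
  = \lim_{k\to\infty} \frac{1}{t_k}\int_0^{t_k}\!\!\int_E P_s f \, d\nu \, ds
  = \int_E f \, d\mu,
% since the two time averages differ by at most 2 \|f\|_\infty \, r / t_k,
% which vanishes as t_k \to \infty; hence \mu P_r = \mu for every r \ge 0.
```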

Auxiliary results
To prove the main results we shall need some auxiliary results that are interesting in their own right, and that are collected in this section. In particular, we recall or prove some Itô-type formulas and provide conditions for the differentiability of solutions to equations in variational form with respect to the initial datum.

Itô formulas
The following version of Itô's formula for the square of the norm is [14, Proposition 6.2]. is such that g is an adapted L 1 (D)-valued process such that g ∈ L 0 (Ω; L 1 (0, T ; L 1 (D))), and there exists α > 0 for which Proof. Since the resolvent of A 1 is ultracontractive by assumption, there exists m ∈ N such that Using a superscript δ to denote the action of (I + δA 1 ) −k , we have where g δ ∈ L 1 (0, T ; H), hence the classical Itô formula yields, for every δ > 0, We are now going to pass to the limit as δ → 0. By the assumptions on A and the regularity properties of Y , g, Y 0 , and G, one has

This implies
Using the dominated convergence theorem, it is not difficult to check that converges to zero in probability, hence also (along a subsequence) Finally, the symmetry assumption on j ensures that (g δ Y δ ) is uniformly integrable on (0, T )× D, so that We shall also need a simplified version of an Itô formula in the variational setting, due to Pardoux, for functions more general than the square of the H-norm. For its proof (in a more general context) we refer to [16, p. 62-ff.].
The previous Itô formula can be extended to processes satisfying weaker integrability conditions, in analogy to Proposition 4.1.
Proof. Since the resolvent of A 1 is ultracontractive by assumption, there exists m ∈ N such that Using a superscript δ to denote the action of (I + δA 1 ) −m , we have for every t ∈ [0, T ], P-almost surely. Let us pass to the limit as δ → 0 in the previous equation. It is clear from the fact that Y (t), Y 0 ∈ H and the continuity of F that , we have (possibly along a subsequence) Finally, by the Davis inequality and the ideal property of Hilbert-Schmidt operators, we have where the first term on the right-hand side converges to 0 because Similarly, since DF (Y δ ) → DF (Y ) a.e., it follows by the dominated convergence theorem that the second term on the right-hand side converges to zero as well. Therefore, passing to a subsequence if necessary, one has

Differentiability with respect to the initial datum for solutions to equations in variational form
Let g ∈ C 2 b (R) and consider the equation $$dX_t + AX_t\,dt + g(X_t)\,dt = G\,dW_t, \qquad X(0) = x,$$ in the variational sense, where A satisfies the hypotheses of Section 2, G ∈ L 2 (U, H), and x ∈ H.
For compactness of notation we shall write E in place of C([0, T ]; H) ∩ L 2 (0, T ; V ). The above equation admits a unique variational solution X x ∈ L 2 (Ω; E). Here and in the following we often use superscripts to denote the dependence on the initial datum. We are going to provide sufficient conditions ensuring that the solution map x → X x belongs to C 2 b (H; L 2 (Ω; E)). The problem of regular dependence on the initial datum for equations in the variational setting does not seem to be addressed in the literature. On the other hand, several results are available for mild solutions (see, e.g., [5,7,13]), where an approach via the implicit function theorem depending on a parameter is adopted. Here we proceed in a more direct and, we believe, clearer way. The results are non-trivial (and probably not easily accessible via the implicit function theorem) in the sense that the solution map is Fréchet differentiable even though, as is well known, the superposition operator associated to g is never Fréchet differentiable unless g is affine. The first and second Fréchet derivatives of the solution map shall be denoted by DX and D 2 X, respectively. These are maps with domain H and codomains L (H, L 2 (Ω; E)) and L 2 (H; L 2 (Ω; E)), respectively. Here and in the following we denote the space of continuous bilinear mappings from H × H to a Banach space F by L 2 (H; F ).
We begin with first-order differentiability.
in the variational sense.
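The displayed equation (4.1) referenced below does not survive in the text; assuming the equation under study has additive noise with constant G ∈ L 2 (U, H), the first-variation equation that Y h should satisfy reads as follows (our reconstruction, not the authors' exact display):

```latex
% Formal linearization in the direction h \in H: since the noise
% coefficient G is constant, the stochastic term drops out and
\frac{d}{dt} Y^h(t) + A Y^h(t) + g'\big(X^x(t)\big)\, Y^h(t) = 0
  \quad \text{in } V', \qquad Y^h(0) = h,
% a linear deterministic equation with random (adapted, bounded)
% coefficient g'(X^x), hence solvable pathwise in
% C([0,T]; H) \cap L^2(0,T; V).
```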
Proof. Classical (deterministic) results imply that (4.1) admits a unique solution Y h ∈ E for P-a.e. ω ∈ Ω. Since X x is an adapted process and h is non-random, it follows that Y h is itself adapted. Alternatively, and more directly, one can apply the stochastic variational theory to (4.1), deducing that Y h ∈ L 2 (Ω; E) is adapted. Let us set, for compactness of notation, where ε is an arbitrary real number. Elementary calculations show that By the integration-by-parts formula applied to the equation for z ε we get where $\langle S_\varepsilon, z_\varepsilon\rangle \le \|S_\varepsilon\|\,\|z_\varepsilon\|$ and, by the Lipschitz continuity of g, For an arbitrary t > 0 one has, by the coercivity of A, hence also, by Fubini's theorem and Gronwall's inequality, It is clear from the hypotheses on g and the definition of . Moreover, it follows by the Lipschitz continuity of g and elementary estimates that This proves that the solution map is differentiable in every direction of H, and that its directional derivative in the direction h ∈ H is given by the (unique) solution Y h to (4.1). It is then clear that the map h → Y h is linear. Let us prove that it is also continuous: in analogy to computations already carried out above, the integration-by-parts formula yields hence also, by Gronwall's inequality and elementary estimates, It is important to note that this inequality holds P-a.s. with a non-random implicit constant that depends only on T and on the Lipschitz constant of g, but not on the initial datum x. From this it follows that Y x := (h → Y h ) belongs to L (H, L 2 (Ω; E)). We are now going to prove that the map x → Y x is continuous. This implies, by a well-known criterion (see, e.g., [2, Theorem 1.9]), that x → X x is Fréchet differentiable with Fréchet derivative (necessarily) equal to Y x . Let (x n ) ⊂ H be a sequence converging to x in H, and write for simplicity X n := X xn , Y n := Y xn , X := X x , and Y := Y x , with a subscript h to denote their action on a fixed element h ∈ H.
One has for which the integration-by-parts formula yields Taking the supremum in time, Gronwall's inequality implies where the implicit constant depends on C, T and on the Lipschitz constant of g. Furthermore, since, as observed above, h → Y h is a linear bounded map from H to C([0, T ]; H) P-a.s. with non-random operator norm, i.e.
and the last term converges to zero as n → ∞ by the dominated convergence theorem, because X n → X in L 2 (Ω; C([0, T ]; H)) and g ∈ C 2 b (in particular, g ′ is Lipschitz-continuous). It immediately follows that x → Y x is a continuous map on H with values in L (H, L 2 (Ω; E)). Furthermore, since we have shown that $\|Y^x_h\|_{L^p(\Omega;E)} \lesssim \|h\|$ for all p ≥ 0 with a constant independent of x, we conclude that x → X x is of class C 1 b from H to L 2 (Ω; E). To establish the second-order Fréchet differentiability of x → X x , it is convenient to consider the equation where h, k ∈ H and Y h , Y k are the solutions to (4.1) with initial conditions h and k, respectively. This is manifestly the equation formally satisfied by the second-order Fréchet derivative of x → X x evaluated at (h, k).
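Equation (4.2) is likewise elided; differentiating the first-variation equation formally once more suggests the following form (a sketch consistent with the terms g ′′ (X)Y h Y k estimated below, not the authors' exact display):

```latex
\frac{d}{dt} Z_{hk}(t) + A Z_{hk}(t) + g'\big(X^x(t)\big)\, Z_{hk}(t)
  + g''\big(X^x(t)\big)\, Y^h(t)\, Y^k(t) = 0
  \quad \text{in } V', \qquad Z_{hk}(0) = 0.
% The quadratic forcing term explains the role of V \hookrightarrow L^4(D):
% by Hölder's inequality,
% \|Y^h Y^k\|_{L^2(D)} \le \|Y^h\|_{L^4(D)} \, \|Y^k\|_{L^4(D)},
% so g''(X) Y^h Y^k is an admissible H-valued forcing term.
```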
In order to prove that (4.2) is well-posed, we need the following lemma, which is probably well known, but for which we could not find a reference, except for the classical case where f ∈ L 2 (0, T ; V ′ ) (see, e.g., [12]). such that y ′ n (t) + Ay n (t) = ℓ(t)y n (t) + f n (t) in V ′ for a.e. t ∈ (0, T ) , y n (0) = y 0 .
Therefore, for every n, m ∈ N, the integration-by-parts formula and an easy computation show that We deduce that there exists y ∈ C([0, T ]; H) ∩ L 2 (0, T ; V ) such that It clearly follows from y ∈ L 2 (0, T ; V ) and A ∈ L (V, V ′ ) that Ay ∈ L 2 (0, T ; V ′ ) and Ay n → Ay in L 2 (0, T ; V ′ ) as n → ∞. Moreover, we also have that In order to prove second-order Fréchet differentiability of the solution map x → X x we need to make the further assumption that V is continuously embedded in L 4 (D). This is satisfied, for instance, if V = H 1 0 and d ≤ 4. In fact, by the Sobolev embedding theorem, $H^1_0(D) \subset L^{2^*}(D)$ with $2^* = 2d/(d-2)$ for d ≥ 3 and $2^* = +\infty$ otherwise, so that $2^* \ge 4$ whenever d ≤ 4. We proceed as follows: first we establish well-posedness for equation (4.2), and then we show that its unique solution identifies D 2 X. Proof. Hölder's inequality and the boundedness of g ′′ yield Let us show that (h, k) → Z hk is a continuous bilinear map. The bilinearity is clear from equation (4.2). Moreover, testing by Z hk and using the coercivity of A we obtain a bound by a constant multiple of $\|h\|\,\|k\|$, and Gronwall's inequality yields $\|Z^x_{hk}\|_{L^2(\Omega;C([0,T];H)) \cap L^2(\Omega;L^2(0,T;V))} \lesssim \|h\|\,\|k\|$ for all $h, k, x \in H$, from which the last assertion follows.
Theorem 4.7. Assume that V is continuously embedded in L 4 (D). Then the solution map x → X x is of class C 2 b from H to L 2 (Ω; E). Proof. We are going to prove first that the Fréchet derivative of the solution map is Gâteaux differentiable, with Gâteaux derivative equal to Z x := (h, k) → Z x hk , and then we shall show that x → Z x is continuous and bounded as a map from H to L 2 (H; L 2 (Ω; E)).
Step 1. Let x ∈ H be arbitrary but fixed, and consider the family of maps z ε ∈ L 2 (H; L 2 (Ω; E)), indexed by ε ∈ R, defined as Elementary manipulations based on the equations satisfied by Y x and Z x show that where the integrand on the right-hand side can be written as R ε + S ε , with

Further algebraic manipulations show that
The integration-by-parts formula and obvious estimates yield Taking the supremum on both sides, one is left with, thanks to Young's inequality, for all δ > 0, from which it follows, taking δ sufficiently small and applying Gronwall's inequality, We are going to show that the right-hand side tends to zero as ε → 0. Since g ∈ C 2 b , it is evident that R ′ ε → 0 almost everywhere as ε → 0 as well as that .

Recalling that, by Theorem 4.4, (X x+εk
We thus conclude that $\lim_{\varepsilon \to 0} z_\varepsilon = 0$ in $L_2(H; L^2(\Omega; E))$, i.e. the directional derivative of x → Y x : H → L (H, L 2 (Ω; E)) exists in every direction and is given by the map x → Z x : H → L 2 (H; L 2 (Ω; E)). Since we have already proved that (h, k) → Z x hk is bilinear and continuous, we infer that x → Y x is Gâteaux differentiable with derivative Z x .
Step 2. In order to conclude that x → Y x is Fréchet differentiable (with derivative necessarily equal to Z) it is enough to show, in view of a criterion already mentioned, that the map x → Z x is continuous. Let (x n ) n ⊆ H be a sequence converging to x in H. We have, writing Z n in place of Z xn for simplicity, with initial condition Z n hk (0) − Z hk (0) = 0. The right-hand side of the equation can be written as $R = \sum_{i \le 4} R_i$, with so that, by the integration-by-parts formula, and, for i = 1, by Young's inequality, By an argument based on Gronwall's inequality already used several times we obtain where $\|R_2\| \le \|g''\|_\infty\,\|(X_n - X)\,Z_{hk}\|$ and, by the bilinearity of Z, where both terms on the right-hand side tend to zero because It remains to consider R 4 : it is clear that (g ′′ (X n ) − g ′′ (X))Y h Y k → 0 almost everywhere by the continuity of g ′′ , and, as before, We have thus proved that, as n → ∞, Recalling that x → Z x is bounded on H, we conclude that x → X x is twice Fréchet-differentiable with continuous and bounded derivatives.

Invariant measures
Throughout this section, we consider equation (1.1) with X 0 ∈ H. Since none of the coefficients depends explicitly on ω ∈ Ω, it follows by a standard argument that the solution X to (1.1) is Markovian. Let P = (P t ) t≥0 be the transition semigroup defined by $P_t f(x) := \mathbb{E}\, f(X(t; 0, x))$, $f \in B_b(H)$, $x \in H$. We shall assume from now on that the pair (A, B) satisfies the coercivity condition where the last term is finite thanks to Theorem 2.1 and the assumption of linear growth on B. Therefore, recalling that, for any r, s ∈ R, j(r) + j * (s) = rs if and only if s ∈ β(r), one has, taking the coercivity condition (5.1) into account, for all t ≥ 0. Let x = 0. For any t ≥ 0 the law of the random variable X(t) is a probability measure on H, which we shall denote by π t . We are now going to show that the family of measures (µ t ) t>0 on H defined by $\mu_t := \frac{1}{t}\int_0^t \pi_s\,ds$ is tight. The ball B n in V of radius n ∈ N is a compact subset of H, because the embedding V ↪ H is compact. Moreover, Markov's inequality and (5.2) yield It follows by Prokhorov's theorem that there exists a probability measure µ on H and a sequence (t k ) k∈N increasing to infinity such that µ t k converges to µ in the topology σ(M 1 (H), C b (H)) as k → ∞. Furthermore, µ is an invariant measure for the transition semigroup P , thanks to the Krylov-Bogoliubov theorem.
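The coercivity condition (5.1) and the a priori bound (5.2) invoked in this section are not displayed above; under the standing variational hypotheses they plausibly take the following standard form, with α > 0 and C ≥ 0 hypothetical constants (our reconstruction):

```latex
% A standard joint coercivity condition for the pair (A, B):
2\,\langle Av, v \rangle - \|B(v)\|^2_{L_2(U,H)}
  \ge \alpha \|v\|_V^2 - C \|v\|^2, \qquad v \in V.
% Combining Itô's formula for \|X(t)\|^2 with the identity
% j(r) + j^*(s) = rs \iff s \in \beta(r) then yields a bound of the type
\mathbb{E}\|X(t)\|^2
  + \mathbb{E}\int_0^t \Big( \alpha \|X(s)\|_V^2
  + \|j(X(s))\|_{L^1(D)} + \|j^*(\xi(s))\|_{L^1(D)} \Big)\, ds
  \lesssim 1 + t, \qquad t \ge 0,
% which is exactly the kind of linear-in-time estimate needed for the
% tightness of the averaged measures below.
```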
We are now going to prove integrability properties of all invariant measures, which in turn provide information on their support. We start with a (relatively) simple yet crucial estimate.

Proposition 5.2. Let µ be an invariant measure for the transition semigroup (P t ). Then one has
where K is the norm of the embedding V ↪ H.
Proof. We are going to apply the Itô formula of Proposition 4.2 to the process X and the function Since g ′ δ > 0 and g ′′ δ < 0, the coercivity condition (5.1) and the monotonicity of β imply Taking into account that |g ′ δ | ≤ 1, the stochastic integral is a martingale, exactly as in the proof of Theorem 5.1, hence has expectation zero, so that By definition of (P t ) we have P t G δ (x) = E G δ (X(t)), from which it follows, by the boundedness of G δ and by definition of invariant measure, Denoting the norm of the embedding V ↪ H by K, we get hence, by Tonelli's theorem and invariance of µ, Taking the limit as δ → 0, the monotone convergence theorem yields In order to state the next integrability results for invariant measures, we need to define the following subsets of H: whose Borel measurability will be proved in Lemma 5.4 below.
where K is the norm of the embedding V ↪ H. In particular, µ is concentrated on V ∩ J ∩ J * .
Proof. Let us introduce the functions Φ, Ψ, Ψ * : H → R + ∪ {+∞}, as well as their approximations Φ n , Ψ n , Ψ * n : H → R + ∪ {+∞}, n ∈ N, defined as (here B n (V ) denotes the ball of radius n in V ) One obviously has where, in the last inequality, we have used the fact that for every r ∈ D(β) = R the sequence {β λ (r)} λ converges from below to β 0 (r), where β 0 (r) is the unique element in β(r) such that |β 0 (r)| ≤ |y| for every y ∈ β(r) (note that the uniqueness of β 0 (r) follows from the maximal monotonicity of β). Thanks to estimate (5.2) we have, by Tonelli's theorem, therefore, integrating with respect to µ and taking the previous proposition into account, uniformly with respect to n. Since Φ n and Ψ n converge pointwise and monotonically from below to Φ and Ψ, respectively, the monotone convergence theorem yields hence, in particular, µ(V ) = µ(J) = 1. Similarly, note that β 1/n ∈ β((I + (1/n)β) −1 ) and 0 ∈ β(0) imply that |β 1/n | converges pointwise to |β 0 | monotonically from below as n → ∞, hence the same holds for the convergence of j * (β 1/n ) to j * (β 0 ) because j * is convex and continuous with j * (0) = 0. Therefore Ψ * n converges to Ψ * pointwise monotonically from below as n → ∞.
Moreover, the lower semicontinuity of convex integrals implies that J n is closed in H for every n, hence Borel-measurable, so that J ∈ B(H). Let us show that, similarly, J * n is also closed in H for every n ∈ N: if (u k ) k ⊂ J * n and u k → u in H, then for every k there exists v k ∈ L 1 (D) with v k ∈ β(u k ) and Since j * is superlinear at infinity, this implies that the family (v k ) k is uniformly integrable in D, hence by the Dunford-Pettis theorem also weakly relatively compact in L 1 (D). Consequently, there is a subsequence (v ki ) i and v ∈ L 1 (D) such that v ki → v weakly in L 1 (D). The weak lower semicontinuity of convex integrals easily implies that Let us show that v ∈ β(u) almost everywhere in D: by definition of subdifferential, for every k ∈ N and for every measurable set E ⊆ D we have By Egorov's theorem, for any ε > 0 there exists a measurable set E ε ⊆ D with |E c ε | ≤ ε and u k → u uniformly in E ε . Taking E = E ε in the last inequality, letting k → ∞ we get which in turn implies by a classical localization argument that Hence, by the arbitrariness of ε, v ∈ β(u) almost everywhere in D, thus also u ∈ J * n . This implies that J * n is closed in H for every n, therefore also that J * ∈ B(H).
The estimates proved above imply that the set of ergodic invariant measures is not empty.
Theorem 5.5. There exists an ergodic invariant measure for the transition semigroup (P t ).
Proof. Recall that, as follows from the Krein-Milman theorem, for a Markovian transition semigroup the set of ergodic invariant measures coincides with the extreme points of the set of all invariant measures (see, e.g., [1, Thm. 19.25]). Let I be the set of all invariant measures for P : by Theorem 5.1, we know that I is not empty and we need to show that I admits at least one extreme point. Let us prove that I is tight. By Theorem 5.3, we know that there exists a constant N such that Therefore, using the notation of the proof of Theorem 5.1, by the Markov inequality as n → ∞. Hence I is tight, and thus admits extreme points.
Under a very mild growth condition on the drift one can also obtain uniqueness.
Theorem 5.6. If β is superlinear, i.e. if there exist c > 0 and δ > 0 such that then there exists a unique invariant measure µ for the transition semigroup P . Moreover, µ is strongly mixing.
Proof. For any x, y ∈ H, by Itô's formula, the monotonicity of A, the superlinearity of β, and Jensen's inequality we have for a positive constant $\tilde c$. Denoting by y(·; y 0 ) the solution to the Cauchy problem one can easily check that c(t) := sup y0≥0 y(t; y 0 ) → 0 as t → ∞ and that c(t) ≥ 0 for every t ≥ 0. We deduce that $\mathbb{E}\|X(t; 0, x) - X(t; 0, y)\|^2 \le c(t)$ for all t ≥ 0.
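The differential inequality and the Cauchy problem referred to in the proof are not displayed; a consistent reconstruction, writing $h(t) := \mathbb{E}\|X(t;0,x) - X(t;0,y)\|^2$ (our sketch, with $\tilde c$ a hypothetical constant), is:

```latex
% Itô's formula, monotonicity of A, superlinearity of \beta and
% Jensen's inequality give, for some \tilde c > 0,
h'(t) \le -\tilde c\, h(t)^{1+\delta}, \qquad h(0) = \|x - y\|^2.
% The comparison Cauchy problem is y' = -\tilde c\, y^{1+\delta},
% y(0) = y_0 \ge 0, with explicit solution
y(t; y_0) = \big( y_0^{-\delta} + \tilde c\,\delta\, t \big)^{-1/\delta}
  \le \big( \tilde c\,\delta\, t \big)^{-1/\delta},
% a bound uniform in y_0, so that
% c(t) := \sup_{y_0 \ge 0} y(t; y_0) \to 0 as t \to \infty.
```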
Let µ be an invariant measure for P . For any ϕ ∈ C 1 b (H) we have $\big| P_t\varphi(x) - \int_H \varphi\,d\mu \big| \le \|D\varphi\|_\infty\, c(t)^{1/2} \to 0$ as t → ∞, uniformly in x, and since C 1 b (H) is dense in L 2 (H, µ), we deduce that $P_t\varphi(x) \to \int_H \varphi\,d\mu$ for any x ∈ H as t → ∞, for every ϕ ∈ L 2 (H, µ). We have thus shown that P admits a unique invariant measure, which is strongly mixing as well.

The Kolmogorov equation
Throughout this section we shall assume that β is a function, rather than just a graph. Let P = (P_t)_{t≥0} be the Markovian semigroup on B_b(H) generated by the unique solution to (1.1), as in the previous section, and let µ be an invariant measure for P . Then P extends to a strongly continuous linear semigroup of contractions on L^p(H, µ) for every p ≥ 1. These extensions will all be denoted by the same symbol. Let −L be the infinitesimal generator in L¹(H, µ) of P , and −L₀ the Kolmogorov operator formally associated to (1.1), i.e. L₀ f(x) = ⟨Ax, Df(x)⟩ + ⟨β(x), Df(x)⟩ − (1/2) Tr[B*(x) D²f(x) B(x)],
where f belongs to a class of sufficiently regular functions introduced below. Our aim is to characterize the "abstract" operator L as the closure of the "concrete" operator L₀. Even though this will be achieved only in the case of additive noise, some intermediate results will be proved in the more general case of multiplicative noise. Let us first show that L₀ is a proper linear (unbounded) operator on L¹(H, µ) with domain D(L₀) := C¹_b(V′) ∩ C²_b(H) ∩ C¹_b(L¹(D)). Here f ∈ C¹_b(V′) means that |Df(x)v′| ≤ N ‖v′‖_{V′} for every x ∈ V and v′ ∈ V′, with the constant N independent of x and v′, and that x → Df(x) ∈ C(V′, V). Analogously, f ∈ C¹_b(L¹(D)) means that, for any x ∈ V and k ∈ L¹(D), there is a constant N independent of x and k such that |Df(x)k| ≤ N ‖k‖_{L¹(D)}, and that x → Df(x) ∈ C(L¹(D), L^∞(D)). For any f ∈ C²_b(H), the trace term in L₀f is bounded, recalling the linear growth condition on B, by a constant times 1 + ‖x‖²_H, which belongs to L¹(H, µ); moreover, by the Young inequality, |⟨β(x), Df(x)⟩| is controlled by ‖j*(β(x))‖_{L¹(D)} and ‖j(Df(x))‖_{L¹(D)}, and, recalling that x → ‖j*(β(x))‖_{L¹(D)} ∈ L¹(H, µ) by Theorem 5.3, it is enough to consider the second term on the right-hand side: for any f ∈ C¹_b(L¹(D)), sup_{x∈V} ‖Df(x)‖_{L^∞(D)} is finite, hence, recalling that j ∈ C(R), j(Df(x)) is bounded pointwise in D, thus also in L¹(D), uniformly over x ∈ V. In particular, x → ‖j(Df(x))‖_{L¹(D)} ∈ L¹(H, µ).
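For the reader's convenience, the duality inequality underlying the bound of ⟨β(x), Df(x)⟩ is the Fenchel-Young inequality for the convex conjugate pair (j, j*); the symmetric treatment of ±r below is where the symmetry condition imposed on β at infinity plausibly enters, making j(r) and j(−r) comparable:

```latex
% By definition of the convex conjugate j^*(s) := \sup_{r \in \mathbb{R}} (rs - j(r)),
\[
\pm r\, s \;\le\; j(\pm r) + j^*(s) \qquad \forall\, r, s \in \mathbb{R},
\quad\text{hence}\quad
|r s| \;\le\; j(r) + j(-r) + j^*(s).
\]
% Applied pointwise on D with r = Df(x)(\xi), s = \beta(x(\xi)), and integrated,
% this controls |\langle \beta(x), Df(x) \rangle_{L^2(D)}| by
% \|j(Df(x))\|_{L^1(D)} + \|j(-Df(x))\|_{L^1(D)} + \|j^*(\beta(x))\|_{L^1(D)}.
```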
Let us now show that the infinitesimal generator −L restricted to D(L₀) coincides with the operator −L₀ defined above. Indeed, by Proposition 4.3, for every g ∈ D(L₀) we have, taking expectations and applying Fubini's theorem, P_t g(x) − g(x) = − ∫₀^t P_s L₀ g(x) ds. Since g ∈ D(L₀), we have that L₀g ∈ L¹(H, µ), as proved above. Therefore, recalling that P is strongly continuous on L¹(H, µ), the map t → P_t L₀g is continuous from [0, T] to L¹(H, µ). Hence, letting t → 0, we obtain (P_t g − g)/t → −L₀g in L¹(H, µ), which implies that L = L₀ on D(L₀). We are now going to construct a regularization of the operator L₀. For any λ ∈ (0, 1), let β_λ := (1/λ)(I − (I + λβ)^{−1}) be the Yosida approximation of β. Denoting a sequence of mollifiers on R by (ρ_n), the function β_λn := β_λ * ρ_n is monotone and infinitely differentiable with all derivatives bounded. Let us consider the regularized equation dX_t + AX_t dt + β_λn(X_t) dt = B(X_t) dW_t, X(0) = x. (6.1) Since β_λn is Lipschitz-continuous, equation (6.1) admits a unique strong (variational) solution X^x_λn ∈ L²(Ω; E), where, as before, E := C([0, T]; H) ∩ L²(0, T; V). Furthermore, the generator of the Markovian transition semigroup P^λn = (P^λn_t)_{t≥0} associated to (6.1) coincides, on D(L₀), with −L^λn₀, the operator obtained from −L₀ replacing β by β_λn. This follows arguing as in the case of L₀ (even using the simpler Itô formula of Proposition 4.2, rather than the one of Proposition 4.3).
Let us now consider the stationary Kolmogorov equation αv + L^λn₀ v = g. (6.2) In view of the well-known relation between (Markovian) resolvents and transition semigroups, one is led to considering the function v_λn(x) := E ∫₀^{+∞} e^{−αt} g(X^x_λn(t)) dt, which is the natural candidate to solve (6.2). If we show that v_λn ∈ C¹_b(V′) ∩ C²_b(H), then an application of Itô's formula (in the version of Proposition 4.2) shows that indeed v_λn solves (6.2). We are going to obtain regularity properties of v_λn via pathwise differentiability of the solution map x → X_λn of the regularized stochastic equation (6.1). From now on we shall restrict our considerations to the case of additive noise, i.e. we assume that B ∈ L₂(U, H) is non-random. Moreover, we shall assume that V is continuously embedded in L⁴(D). The latter assumption is needed to apply the second-order differentiability results of §4.2. We recall that, thanks to Theorems 4.4 and 4.7, the solution map x → X_λn : H → L²(Ω; E) is Lipschitz continuous and twice Fréchet differentiable. Moreover, denoting its first-order Fréchet differential by DX_λn : H → L(H, L²(Ω; E)), for any h ∈ H the process Y_h := (DX_λn)h ∈ L²(Ω; E) satisfies the linear deterministic equation with random coefficients Y′_h + AY_h + β′_λn(X^x_λn) Y_h = 0, Y_h(0) = h. (6.3) Similarly, denoting the second-order Fréchet differential of x → X_λn by D²X_λn, for any h, k ∈ H the process Z_hk := D²X_λn(h, k) ∈ L²(Ω; E) satisfies the linear deterministic equation with random coefficients Z′_hk + AZ_hk + β′_λn(X^x_λn) Z_hk + β″_λn(X^x_λn) Y_h Y_k = 0, Z_hk(0) = 0. We shall need the following result on the connection between variational and mild solutions in the deterministic setting. We recall that A₂ denotes the part of A on H.
In particular, v_ε is also a variational solution of the equation v′_ε + Av_ε = f_ε, v_ε(0) = u^ε₀. By construction we have that v_ε → v in C([0, T]; H); moreover, since f_ε → f in L²(0, T; H) and u^ε₀ → u₀ in H, arguing as in the proof of Lemma 4.5 we also have that v_ε → u in C([0, T]; H) ∩ L²(0, T; V). Since mild and variational solutions are unique, we conclude that u = v. We shall now extend this argument to the case where u and v are the unique variational and mild solutions to the equations u′ + Au = F(·, u), u(0) = u₀, and v′ + A₂v = F(·, v), v(0) = u₀, respectively. Setting f := F(·, v), the assumptions on F imply that f ∈ L²(0, T; H), hence v is a mild solution to v′ + A₂v = f, v(0) = u₀. It then follows by the previous argument that v is also the unique variational solution to v′ + Av = f, v(0) = u₀. Therefore v′ + Av = F(·, v), v(0) = u₀, in the variational sense. Using the integration-by-parts formula, the Lipschitz continuity of F, and Gronwall's inequality, it is then a standard matter to show that u = v.
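For reference, the mild formulation used throughout this argument is the standard variation-of-constants formula, with S the semigroup generated by −A₂ on H:

```latex
% Mild solution of v' + A_2 v = f, v(0) = u_0, on [0, T]:
\[
v(t) \;=\; S(t)\, u_0 + \int_0^t S(t-s)\, f(s) \, ds , \qquad t \in [0, T],
\]
% and, for the semilinear problem v' + A_2 v = F(\cdot, v), v(0) = u_0:
\[
v(t) \;=\; S(t)\, u_0 + \int_0^t S(t-s)\, F\bigl(s, v(s)\bigr) \, ds .
\]
```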
The following estimates are crucial. They can be summarized as follows: on a set Ω′ ⊆ Ω of full probability, (i) ‖Y^x_h(t)‖_H ≤ ‖h‖_H; (ii) ‖Z^x_hk(t)‖_H ≲ ‖h‖_H ‖k‖_H; (iii) ‖Y^x_h(t)‖_{L¹(D)} ≤ ‖h‖_{L¹(D)} for h ∈ H ∩ L¹(D); (iv) ‖Y^x_h(t)‖_H ≲ (1 ∨ t^{−δ}) ‖h‖_{V′}; uniformly over t ∈ [0, T], n ∈ N and λ ∈ (0, 1).
Let ω ∈ Ω′ be fixed. Recalling that A is coercive and that β′_λn is positive because β_λn is increasing, taking the scalar product with Y_h(t) in (6.3) and integrating in time yields ‖Y_h(t)‖²_H + C ∫₀^t ‖Y_h(s)‖²_V ds ≤ ‖h‖²_H for all t ∈ [0, T], and the first estimate is thus proved. The second estimate follows directly from Proposition 4.7. Furthermore, denoting the Yosida approximation of the part of A in H by A_ε, let Y^x_hε ∈ C([0, T]; H) be the unique strong solution to the equation obtained from (6.3) replacing A by A_ε. Let (σ_k) be a sequence of smooth increasing functions approximating pointwise the (maximal monotone) signum graph, and σ̃_k be the primitive of σ_k with σ̃_k(0) = 0. Taking the scalar product of the previous equation with σ_k(Y^x_hε) and integrating in time we get an identity valid for every t > 0, in which we can pass to the limit: since, as k → ∞, σ_k(Y^x_hε) converges a.e. to a measurable function w_ε ∈ sgn(Y^x_hε) and σ̃_k → | · |, letting k → ∞ we get, for every t ≥ 0, that ‖Y^x_hε(t)‖_{L¹(D)} plus two further terms is bounded by ‖h‖_{L¹(D)}. Recalling that A₂ extends to an m-accretive operator on L¹(D), the second term on the left-hand side is non-negative, and taking into account that Y^x_hε → Y^x_h in C([0, T]; H) as ε → 0, the third inequality follows. Finally, since Y_h is the unique variational solution to (6.3), by Lemma 6.1 we have that Y_h is also a mild solution to the same equation, i.e. Y_h(t) = S(t)h − ∫₀^t S(t − s) β′_λn(X^x_λn(s)) Y_h(s) ds.
Recall that −A generates an analytic semigroup on V ′ extending S, denoted by the same symbol.
Since H = D((ηI + A)^δ), we have ‖S(t)h‖_H ≲ t^{−δ} ‖h‖_{V′} for every t > 0. By the contractivity of S in H we also have, for every t > 0, ‖S(t + 1)h‖_H ≤ ‖S(1)h‖_H. Therefore we have, for every t ∈ (0, 1], ‖S(t)h‖_H ≲ t^{−δ} ‖h‖_{V′}, as well as, for every t ≥ 1, ‖S(t)h‖_H ≤ ‖S(1)h‖_H ≲ ‖h‖_{V′}, which implies the last estimate.
Lemma 6.3. Let g ∈ C²_b(H) ∩ C¹_b(V′) ∩ C¹_b(L¹(D)). For every n ∈ N and λ ∈ (0, 1), the function v_λn : H → R defined as v_λn(x) := E ∫₀^{+∞} e^{−αt} g(X^x_λn(t)) dt belongs to C²_b(H) ∩ C¹_b(V′) ∩ C¹_b(L¹(D)), with norms bounded uniformly for all n ∈ N and λ ∈ (0, 1).
Proof. Since g ∈ C¹_b(H), for any h ∈ H we have, by the first estimate of Proposition 6.2, |Dg(X^x_λn(t)) Y^x_h(t)| ≲ ‖h‖_H; hence, by the dominated convergence theorem, v_λn ∈ C¹_b(H) and Dv_λn(x)h = E ∫₀^{+∞} e^{−αt} Dg(X^x_λn(t)) Y^x_h(t) dt. (6.6) The uniform boundedness of ‖v_λn‖_{C¹_b(H)} in λ and n follows directly from these computations. Similarly, using the fact that g ∈ C²_b(H) and the second estimate of Proposition 6.2, we have, for every h, k ∈ H, a bound proportional to ‖h‖_H ‖k‖_H; hence, by the dominated convergence theorem, v_λn ∈ C²_b(H) and D²v_λn(x)(h, k) = E ∫₀^{+∞} e^{−αt} [D²g(X^x_λn(t)) Y^x_h(t) Y^x_k(t) + Dg(X^x_λn(t)) Z^x_hk(t)] dt. (6.7) Moreover, using the third estimate of Proposition 6.2 and the fact that g ∈ C¹_b(L¹(D)), it follows by Hölder's inequality and (6.6) that |Dv_λn(x)h| ≲ ‖h‖_{L¹(D)}, which implies that v_λn ∈ C¹_b(L¹(D)) and the estimate (6.5). Finally, by the last estimate of Proposition 6.2 and the fact that g ∈ C¹_b(V′), we have |Dg(X^x_λn(t)) Y^x_h(t)| ≲ (1 ∨ t^{−δ}) ‖h‖_{V′}. Since t → (1 ∨ t^{−δ}) e^{−αt} belongs to L¹(0, +∞), we have |Dv_λn(x)h| ≤ N_{λ,n} ‖h‖_{V′}, thus also v_λn ∈ C¹_b(V′). Let us show now that v_λn solves (6.2). Indeed, since g ∈ C²_b(H) ∩ C¹_b(V′), by Itô's formula in the version of Proposition 4.2 we get g(X^x_λn(t)) + ∫₀^t ⟨AX^x_λn(s), Dg(X^x_λn(s))⟩ ds + ∫₀^t ⟨β_λn(X^x_λn(s)), Dg(X^x_λn(s))⟩ ds = g(x) + (1/2) ∫₀^t Tr[B*(X^x_λn(s)) D²g(X^x_λn(s)) B(X^x_λn(s))] ds + ∫₀^t Dg(X^x_λn(s)) B(X^x_λn(s)) dW(s) for every t > 0. Thanks to the boundedness of Dg, taking expectations and using Fubini's theorem we deduce that, for every α > 0 and x ∈ V, e^{−αt} E g(X^x_λn(t)) + α E ∫₀^t e^{−αs} g(X^x_λn(s)) ds + ∫₀^t e^{−αs} P^λn_s L^λn₀ g(x) ds = g(x).
Since g ∈ C_b(H), it is clear that, as t → +∞, the first and second terms on the left-hand side converge to zero and αv_λn(x), respectively; hence, by difference, we deduce that ∫₀^t e^{−αs} P^λn_s L^λn₀ g(x) ds → ∫₀^{+∞} e^{−αs} P^λn_s L^λn₀ g(x) ds.
which belongs to L¹(D) for every x ∈ J*. Therefore, by the dominated convergence theorem, the conclusion follows. We are now in a position to state and prove the main result of this section, which gives a positive answer to the problem of L¹-uniqueness for the Kolmogorov operator L₀. The question is whether the extension to L¹(H, µ) of the transition semigroup P, generated by the solution to the stochastic equation (1.1), is the only strongly continuous semigroup on L¹(H, µ) whose infinitesimal generator is an extension of the Kolmogorov operator L₀. Recall that, apart from the standing assumptions of §2, we are also assuming that β is a function, B is non-random and does not depend on the unknown, V is continuously embedded in L⁴(D), and H is the domain of a fractional power of (a shift of) A, seen as the negative generator of an analytic semigroup on V′. Theorem 6.5. The generator L of the extension to L¹(H, µ) of the transition semigroup P is the closure of L₀ in L¹(H, µ).
Proof. Since the extension of the transition semigroup P to L¹(H, µ) is contractive, it follows by the Lumer-Phillips theorem that L is m-accretive. As L coincides with L₀ on D(L₀), this implies that L₀ is accretive in L¹(H, µ), hence, in particular, closable. We are going to show that the image of αI + L₀ is dense in L¹(H, µ) for all α > 0. Let f ∈ L¹(H, µ) and ε > 0. Since D(L₀) is dense in L¹(H, µ), there exists g ∈ D(L₀) such that ‖f − g‖_{L¹(H,µ)} < ε/2. Setting, for any n ∈ N and λ ∈ (0, 1), v_λn(x) := ∫₀^∞ e^{−αt} E g(X^x_λn(t)) dt, it follows by Lemma 6.3 that v_λn ∈ D(L₀) and that αv_λn(x) + L^λn₀ v_λn(x) = g(x) for every x ∈ V ∩ J ∩ J*, hence also µ-almost everywhere in H. Thanks to Lemma 6.4, there exist λ₀ > 0 and n₀ ∈ N such that ‖(αI + L₀)v_λn − g‖_{L¹(H,µ)} < ε/2 for all λ < λ₀ and n > n₀, so that ‖(αI + L₀)v_λn − f‖_{L¹(H,µ)} < ε. As ε > 0 was arbitrary, it follows that the image of αI + L₀ is dense in L¹(H, µ). Since L₀ is closable, the Lumer-Phillips theorem implies that −L̄₀, the closure of −L₀ in L¹(H, µ), generates a strongly continuous semigroup of contractions on L¹(H, µ). Recalling that L is an extension of L₀, it follows again by the Lumer-Phillips theorem that L = L̄₀ (see, for instance, [8, Theorem 1.12]).