Existence of martingale solutions and large-time behavior for a stochastic mean curvature flow of graphs

We are concerned with a stochastic mean curvature flow of graphs over a periodic domain of any space dimension. We establish existence of martingale solutions which are strong in the PDE sense and study their large-time behavior. Our analysis is based on a viscous approximation and new global bounds, namely, an $L^{\infty}_{\omega,x,t}$ estimate for the gradient and an $L^{2}_{\omega,x,t}$ bound for the Hessian. The proof makes essential use of the delicate interplay between the deterministic mean curvature part and the stochastic perturbation, which allows us to show that certain gradient-dependent energies are supermartingales. Our energy bounds in particular imply that solutions become asymptotically spatially homogeneous and approach a Brownian motion perturbed by a random constant.


Introduction
The mean curvature flow (MCF) of hypersurfaces is one key example of a geometric evolution law and is of major importance both for applications and for the mathematical theory of surface evolution equations, see for example [57], [17], [42] or [4] and the references therein.
Given a family $(\Gamma(t))_{t>0}$ of smooth $n$-dimensional hypersurfaces in $\mathbb{R}^{n+1}$, mean curvature motion is characterized by the evolution law $V(x,t) = H(x,t)$ for $x \in \Gamma(t)$, where $V$ describes the velocity in direction of a fixed smooth normal field $\nu$ and $H$ denotes the mean curvature with respect to the same normal field (in our notation $H$ is given by the sum of the principal curvatures).
The motion by mean curvature has attracted much attention. It is the simplest gradient flow dynamic of the surface area energy, that is a relevant energy in numerous applications. There are several analogies to the heat equation, as can be seen in the distance function formulation of MCF (see for example [4]) or the approximation by mean curvature flow for nearly flat graphs. One of the consequences is that a comparison (or inclusion) principle holds and that convexity is conserved. On the other hand, MCF is a nonlinear evolution, governed by a degenerate quasilinear elliptic operator. This in particular leads to the possibility that singularities appear in finite time and that the topology changes. For example, balls shrink in finite time to points and for certain dumbbell type initial shapes a pinch-off of components happens. Such challenges have been the origin and motivation for several important developments in geometric analysis, starting with the pioneering work of Brakke [9] on geometric measure theory approaches, level set methods as developed by Evans and Spruck [21,22,23,24] and Chen, Giga, Goto [13], De Giorgi's barrier method [6,5] or time discrete approximations as introduced by Luckhaus and Sturzenhecker [41] and Almgren, Taylor and Wang [1].
The formation of singularities on the other hand can be excluded in particular situations such as the evolution of entire graphs, where solutions exist globally in time [19] or for initial data given by compact, smooth and convex hypersurfaces [31]. In the latter case the surfaces become round and shrink to a point in finite time.
Several of the techniques developed for mean curvature flow have been successfully applied to deterministic perturbations of the flow [13,2,3,12,43] that are present in a number of applications. A random forcing was added to mean curvature flow in [34] to account for thermal fluctuations. In this paper we study a particular stochastic perturbation in the case of hypersurfaces given as graphs over the $n$-dimensional flat torus. To motivate the equation let us start from the general case of a random evolution $(\Gamma(t))_{t>0}$ of surfaces in $\mathbb{R}^{n+1}$ that are given by immersions $\varphi_t : \Gamma \to \mathbb{R}^{n+1}$ of a fixed smooth manifold $\Gamma$. We then consider a real-valued Wiener process $W$ defined on some probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and the stochastic differential equation
$$d\varphi_t(x) = \nu(x,t)\,\big(H(x,t)\,dt + \circ\, dW(t)\big), \qquad (1)$$
which is possibly the simplest stochastic perturbation, by a one-dimensional white noise acting uniformly in all points of the surface in normal direction. If we further restrict ourselves to the case of graphs over the flat torus $\mathbb{T}^n$ (represented by the unit cube and periodic boundary conditions), that is, $\Gamma(\omega,t) = \operatorname{graph} u(\omega,\cdot,t) = \{(x, u(\omega,x,t)) \in \mathbb{R}^{n+1} \mid x \in \mathbb{T}^n\}$ for a (random) function $u : \Omega \times \mathbb{T}^n \times (0,\infty) \to \mathbb{R}$, we are led to the following Stratonovich differential equation
$$du = Q(\nabla u)\,\nabla\cdot\big(v(\nabla u)\big)\,dt + Q(\nabla u)\circ dW, \qquad (2)$$
where $Q(\nabla u)$ denotes the area element and $v(\nabla u)$ the horizontal projection of the normal to the graph,
$$Q(p) := \sqrt{1+|p|^2}, \qquad v(p) := \frac{p}{\sqrt{1+|p|^2}}, \qquad p \in \mathbb{R}^n.$$
The choice of the Stratonovich instead of an Itô differential in (2) is necessary to keep the geometric character of the equation, see the discussion in [40]. Despite its origin from a rather simple stochastic forcing, the evolution equation for the graphs presents severe difficulties. In particular, the presence of a multiplicative noise with nonlinear gradient dependence in combination with the degeneracy in the quasilinear elliptic term are challenges for a rigorous analysis, and it is a priori not clear whether or not solutions stay graphs.
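The passage from the geometric law (1) to the graph equation (2) can be sketched by a short formal computation. The sign conventions below (upward unit normal, mean curvature as the divergence of $v(\nabla u)$) are assumptions made for this sketch; they are chosen so that the result agrees with the description of $Q$ and $v$ in the text.

```latex
% Assumed conventions: upward unit normal of graph u, Q(p) = \sqrt{1+|p|^2}, v(p) = p/Q(p)
\nu = \frac{(-\nabla u,\ 1)}{Q(\nabla u)}, \qquad
H = \nabla\cdot\Big(\frac{\nabla u}{Q(\nabla u)}\Big) = \nabla\cdot v(\nabla u).
% A vertical displacement du of the graph has normal component du / Q(\nabla u),
% since (0, du)\cdot\nu = du / Q(\nabla u). Hence normal velocity V corresponds to
du = Q(\nabla u)\, V .
% Inserting V\,dt = H\,dt + \circ\,dW from (1):
du = Q(\nabla u)\,\nabla\cdot v(\nabla u)\,dt + Q(\nabla u)\circ dW .
```

With these conventions a convex graph moves upward under the deterministic part, in line with the shrinking of spheres.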
The deterministic mean curvature flow for graphs was considered in [18], where an a priori gradient bound was proved and the long-time behavior was analyzed, see [30] for graphs over a given domain with vertical contact angle. Lions and Souganidis presented a general well-posedness theory and introduced a notion of stochastic viscosity solutions [36,37,38,39] for geometric equations of mean curvature flow type (and beyond), but no regularity properties other than continuity are obtained for the solutions. The evolution (1) for the case n = 1 was investigated by Souganidis and Yip [53] and Dirr, Luckhaus and Novaga [16], where a stochastic selection principle was identified in situations where non-uniqueness appears for the deterministic flow. In [16] also an existence result was proved, but only for short time intervals determined by a random variable that is not necessarily bounded from below. Other (formal) approximations of stochastically perturbed mean curvature flow equations have been studied, such as a time discrete scheme in [56] and stochastic Allen-Cahn equations in [25,50,55,7,8].
The Stratonovich differential equation (2) was already considered in the case $n = 1$ in [20] and, mainly for $n = 2$, in [28] by von Renesse and the second and third author. The present paper continues and extends these results in several respects. The most important new contribution is a uniform (i.e. $L^\infty$ with respect to all three variables $\omega$, $x$, $t$) gradient bound for $u$ and an $L^2$-bound for the Hessian in arbitrary dimensions. This is a major improvement compared to [28], where only $H^1$-estimates for $u$ and an $L^2$-estimate for the mean curvature were shown. Our gradient bound in particular shows that a solution stays a graph for all times. More precisely, Lipschitz continuity of the initial condition is preserved during the evolution. As a consequence of our improved bounds we are able to prove the existence of martingale solutions that are strong in the PDE sense for any space dimension. In contrast, in [28] the existence result was restricted to two dimensions and the solutions were only weak in the PDE sense.
Our proof of the gradient bound uses a Bernstein type argument [27, Section 14.1] but in a context of energy methods, which seems to be new even for deterministic mean curvature flow equations. In the deterministic case this argument reduces to an argument which is similar to the way the gradient bounds in [18] are derived from Huisken's weighted monotonicity formula, but instead of the backward heat kernel a constant kernel is used.
The $L^\infty$-gradient bound, and in particular its uniformity with respect to the randomness variable $\omega$, may appear somewhat surprising in the field of SPDEs. It is a consequence of the geometrical nature of the model and more precisely of the fact that the structure of the noise respects the underlying deterministic evolution. This is reflected in our energy-type estimates: by exploiting the precise structure of all the involved quantities we are able to group them in such a way that each term can be shown to be non-positive and additionally yields a control of second derivatives. The identification of the non-positive terms makes use of the interplay between the deterministic mean curvature part of the equation and the stochastic perturbation. Moreover, we are also able to study the large-time behavior of solutions and prove that solutions become homogeneous in space and asymptotically only deviate from a constant value by a Wiener process. This result improves the results of [20] by obtaining a stronger convergence and extending it to arbitrary dimensions.
In contrast to [28] we will use the abstract theory of variational SPDEs [48] to handle equation (2). Although (2) itself only has a variational structure for $n = 1$, which was exploited in [20], the gradient of a solution will indeed solve a variational SPDE for arbitrary dimensions. Since (2) lacks coercivity we will approximate it for $\varepsilon > 0$ by
$$du = \varepsilon\,\Delta u\,dt + Q(\nabla u)\,\nabla\cdot\big(v(\nabla u)\big)\,dt + Q(\nabla u)\circ dW, \qquad (5)$$
which is coercive in an appropriate sense. We will call (5) the viscous equation.
Since the viscous equation is not covered by the classical theory for variational SPDEs [48,26], we include an Itô formula and an abstract existence result for a large class of equations in Appendix A. These results, which are of independent interest, generalize results from the pioneering works of Pardoux [45] and Viot [54].
For a precise formulation of our main results and an overview of the main techniques of the proofs see Section 3 below.
We note that under our assumptions on the initial condition the stochastic viscosity theory à la Lions, Souganidis [36,37,38,39] yields the existence of a unique viscosity solution. Proving the coincidence of our solution with the viscosity solution seems a major challenge and out of reach at the moment. Comparing the two notions, our solutions have better regularity properties implying not only space-time Hölder continuity but in addition $L^2$-regularity of second order derivatives in space. In particular the mean curvature operator is well-defined in a pointwise a.e. sense. Furthermore, we are able to characterize the large-time behavior. On the other hand, proving uniqueness for our solutions (which is necessary and most likely also a major tool to obtain the equivalence of the concepts) remains open. For our solutions an energy based approach to uniqueness seems most appropriate but seems to require even higher regularity of solutions and a control of the evolution of quantities like the normal vectors or the surface area measure.
This paper is organized as follows: After explaining the notation in Section 2 we present our results in Section 3. Existence of solutions of the viscous equation will be established in Section 4. In Section 5 we prove, similarly to Huisken's monotonicity formula, that certain energies are non-increasing uniformly in $\varepsilon$. We apply this to deduce uniform $H^2$ and uniform $L^\infty$ gradient bounds for solutions of the viscous equation. In Section 6 we prove that solutions of (5) converge to a solution of (2), which in particular proves that there exists a solution. The large-time behavior of a solution is analyzed in Section 7. We present the theory of variational SPDEs in spaces with compact embedding in Appendix A.

Notation
In this section we introduce the basic notation used throughout the paper.
Hilbert-Schmidt operators. Let $U$, $H$ be two separable Hilbert spaces and $(g_k)_k$ an orthonormal basis of $U$. With $L_2(U;H)$ we will denote the space of all Hilbert-Schmidt operators $T : U \to H$ with the norm $\|T\|^2_{L_2(U;H)} := \sum_k \|T g_k\|_H^2$, which is independent of the choice of the orthonormal basis.
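The independence of the Hilbert-Schmidt norm from the chosen basis follows from a standard two-fold application of Parseval's identity. In the sketch below, $(h_j)_j$ is an auxiliary orthonormal basis of $H$ that does not appear in the text and is introduced only for this computation.

```latex
% (h_j)_j: arbitrary orthonormal basis of H (auxiliary, introduced for this sketch)
\sum_k \|T g_k\|_H^2
  = \sum_k \sum_j |\langle T g_k, h_j\rangle_H|^2     % Parseval in H
  = \sum_j \sum_k |\langle g_k, T^* h_j\rangle_U|^2   % adjoint, exchange of sums (Tonelli)
  = \sum_j \|T^* h_j\|_U^2 .                          % Parseval in U
```

The right-hand side no longer involves $(g_k)_k$, so any other orthonormal basis of $U$ gives the same value.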
With $L_1(H)$ we will denote the space of all nuclear operators $T : H \to H$, equipped with the nuclear norm. It is well known that $(L_1(H))^* = L(H)$ and that the weak-$*$ topology on $L(H)$ coincides on norm bounded subsets with the weak operator topology on $L(H)$, which is the weakest topology such that for all $x, y \in H$ the map $L(H) \to \mathbb{R}$, $T \mapsto \langle Tx, y\rangle_H$ is continuous.
Furthermore, for a Banach space $E$ we will use the notation $(E, w)$ resp. $(E', w^*)$ to denote the space $E$ with the weak topology resp. the dual space $E'$ with the weak-$*$ topology.
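Stochastic processes. For an interval $I = [0,T]$ with $T > 0$ or $I = [0,\infty)$, a stochastic basis $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t\in I}, \mathbb{P})$ consists of a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ together with a filtration $(\mathcal{F}_t)_{t\in I}$. According to [14] the filtration $(\mathcal{F}_t)_t$ will be called a normal filtration, if
• $A \in \mathcal{F}_0$ for all $A \in \mathcal{F}$ with $\mathbb{P}(A) = 0$ and
• for all $t \in I$ with $t < \sup I$ we have that $\mathcal{F}_t = \mathcal{F}_{t+} := \bigcap_{s \in I,\, s > t} \mathcal{F}_s$.
For such a filtration and an $(\mathcal{F}_t)_t$-Wiener process $W$ on a separable Hilbert space $U$ with covariance operator $Q \in L(U)$, which we always assume to be positive definite, one can define the space $U_0 := Q^{1/2}(U)$ with the induced scalar product. If $H$ is another separable Hilbert space and $\Phi$ is a predictable $L_2^0 := L_2(U_0; H)$-valued process with $\mathbb{P}\big(\int_0^t \|\Phi(s)\|^2_{L_2^0}\,ds < \infty\big) = 1$ for all $t \in I$, then the stochastic Itô integral $\int_0^t \Phi(s)\,dW(s)$, $t \in I$, is a well-defined local martingale with values in $H$.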
Stratonovich integral. In the situation above, it is sometimes more natural to consider the stochastic Stratonovich integral which, however, might not be well-defined.
If at least formally one has the evolution law with an L 0 2 -valued process µ and an L 2 (U 0 ; L 0 2 ) = L 2 (U 0 × U 0 ; H)-valued process σ, then formally one has for all t ∈ [0, T ], with (g k ) k an orthonormal basis of U . The value on the right hand side does not depend on the choice of (g k ) k . Whenever the right hand side of (6) is well-defined, we can think of it as the definition for the Stratonovich integral on the left hand side of (6).
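For orientation, the conversion between the two integrals can be written out in the scalar situation relevant for (2), that is, for a real-valued Wiener process. The sketch below is the standard identity for continuous semimartingales; it assumes that $\Phi$ has the Itô differential $d\Phi = \mu\,dt + \sigma\,dW$ and uses $\langle \Phi, W\rangle$ for the quadratic covariation, a symbol not used in the text.

```latex
% Assumption: W real-valued, d\Phi = \mu\,dt + \sigma\,dW in the Itô sense
\int_0^t \Phi(s)\circ dW(s)
  = \int_0^t \Phi(s)\,dW(s) + \tfrac12\,\langle \Phi, W\rangle_t
  = \int_0^t \Phi(s)\,dW(s) + \tfrac12 \int_0^t \sigma(s)\,ds .
```

Only the martingale coefficient $\sigma$ of $\Phi$ enters the correction; the drift $\mu$ does not contribute.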
Periodic Sobolev spaces. For $k \ge 0$, $p \in [1,\infty]$ we will denote with $W^{k,p}(\mathbb{T}^n)$ the space of periodic Sobolev functions on the flat torus $\mathbb{T}^n$, which can be identified with the completion of the space of $[0,1]^n$-periodic $C^\infty(\mathbb{R}^n)$ functions with respect to the $\|\cdot\|_{W^{k,p}([0,1]^n)}$ norm.
Matrix scalar product. For matrices A, B, C, D ∈ R n×n we will write We will use the convention that

Results
In this section we will state the main results of this paper. The proofs are given in the subsequent sections. We will first formulate our solution concept. We are concerned with solutions that are strong in the PDE sense, that is, an integral form of (2) is satisfied pointwise. In addition, they may be either strong or weak in the probabilistic sense, depending on whether the underlying probabilistic elements are given in advance or not.
Definition 3.1. (i) Let $I = [0,\infty)$ or $I = [0,T]$ with $T > 0$, $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t\in I}, \mathbb{P})$ be a stochastic basis with a normal filtration together with a real-valued $(\mathcal{F}_t)$-Wiener process $W$ and $u_0 \in L^2(\Omega; H^1(\mathbb{T}^n))$ be $\mathcal{F}_0$-measurable. A predictable $H^2(\mathbb{T}^n)$-valued process $u$ with $u \in L^2(\Omega; L^2(0,t; H^2(\mathbb{T}^n)))$ for all $t \in I$ is a strong solution of (2) with initial data $u_0$, if
$$u(t) - u_0 = \int_0^t Q(\nabla u(s))\,\nabla\cdot\big(v(\nabla u(s))\big)\,ds + \int_0^t Q(\nabla u(s))\circ dW(s) \quad \mathbb{P}\text{-a.s. in } L^2(\mathbb{T}^n) \text{ for all } t \in I.$$
(ii) Let $\Lambda$ be a Borel probability measure on $H^1(\mathbb{T}^n)$ with bounded second moments. A martingale solution of (2) with initial data $\Lambda$ is given by $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t\in I}, \mathbb{P})$ together with $W$, $u_0$ and $u$ such that (i) is satisfied and $\mathbb{P}\circ u_0^{-1} = \Lambda$.
In the same way we can define strong solutions and martingale solutions for (5).
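In the following we will often just write that $u$ is a strong solution instead of specifying that $u$ is a strong solution for a time interval $I$ with respect to a stochastic basis with a normal filtration and a real-valued Wiener process. If not otherwise specified the stochastic basis will be denoted by $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t\in I}, \mathbb{P})$ and the Wiener process by $W$.
Remark 3.2. Note that formally for a solution $u$ of (2) one can use the chain rule, which holds true for the Stratonovich integral, to deduce that
$$d\,Q(\nabla u) = v(\nabla u)\cdot\nabla\big(Q(\nabla u)\,\nabla\cdot v(\nabla u)\big)\,dt + v(\nabla u)\cdot\nabla\big(Q(\nabla u)\big)\circ dW.$$
Hence, according to Section 2 the Itô-Stratonovich correction for the integral in Definition 3.1 is given by
$$\tfrac12\, v(\nabla u)\cdot\nabla\big(Q(\nabla u)\big) = \tfrac12\, v(\nabla u)\cdot D^2u\, v(\nabla u)$$
and the Stratonovich integral in Definition 3.1 has to be understood in the sense that
$$\int_0^t Q(\nabla u)\circ dW = \int_0^t Q(\nabla u)\,dW + \frac12\int_0^t v(\nabla u)\cdot D^2u\, v(\nabla u)\,ds,$$
such that the equation in Definition 3.1 becomes
$$du = Q(\nabla u)\,\nabla\cdot\big(v(\nabla u)\big)\,dt + \tfrac12\, v(\nabla u)\cdot D^2u\, v(\nabla u)\,dt + Q(\nabla u)\,dW.$$
Once this stochastic basis and the corresponding Wiener process are found, the martingale solution is a strong solution with respect to this particular choice of stochastic basis and Wiener process.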
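The structure of the Itô form of (2) described in Remark 3.2 can be made more explicit by a formal computation for smooth $u$. The sketch below takes as given that the Itô-Stratonovich correction obtained from the Stratonovich chain rule is $\tfrac12\, v\cdot D^2u\,v$, and it assumes the matrix scalar product $A : B = \sum_{i,j} a_{ij} b_{ij}$ and the notation $v\otimes v = (v_i v_j)_{i,j}$.

```latex
% Notation: Q = Q(\nabla u), v = v(\nabla u) = \nabla u / Q, and \nabla Q = D^2u\,v
\nabla\cdot v
  = \frac{\Delta u}{Q} - \frac{\nabla u\cdot\nabla Q}{Q^2}
  = \frac{\Delta u - v\cdot D^2u\,v}{Q}
\quad\Longrightarrow\quad
Q\,\nabla\cdot v = \Delta u - v\cdot D^2u\,v = (\mathrm{Id} - v\otimes v) : D^2u .
\\
% Adding the correction \tfrac12\, v\cdot D^2u\,v gives the formal Itô drift
Q\,\nabla\cdot v + \tfrac12\, v\cdot D^2u\,v = \big(\mathrm{Id} - \tfrac12\, v\otimes v\big) : D^2u .
% Eigenvalues of Id - (1/2) v\otimes v: 1 on v^\perp and 1 - |v|^2/2 \in (1/2, 1] in direction v.
```

Since $|v| < 1$, the coefficient matrix of the Itô drift is bounded below by $\tfrac12\,\mathrm{Id}$ for every value of $\nabla u$, whereas the Stratonovich drift coefficient $\mathrm{Id} - v\otimes v$ degenerates as $|\nabla u| \to \infty$.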
Remark 3.4. From Corollary A.5 we infer that a strong solution of (2) or (5) has a modification with continuous paths in H 1 (T n ) and u ∈ L 2 (Ω; C([0, t]; H 1 (T n ))) for all t ∈ I. Furthermore, under suitable assumptions on the initial data we deduce that u ∈ C([0, t]; C(T n )) P-a.s. for all t ∈ I, see Remark 6.1 below.
We are now ready to state the main result of the present paper.
for some constant L > 0.
Our next main result shows that solutions become spatially constant for $t \to \infty$. Then there is a real-valued random variable $\alpha$ such that
Remark 3.7. We will deduce existence of solutions $(u^\varepsilon)_{\varepsilon>0}$ of the viscous equation (5) using the abstract theory of variational SPDEs presented in Appendix A. The fact that (5) can be treated as a coercive equation already yields estimates for the Dirichlet energy of solutions.
In Section 5 we will extend these arguments to prove more general a priori estimates for solutions which are uniform in ε > 0. For this we will make use of a generalization of the classical Itô formula to prove that certain gradient-dependent energies are non-increasing for solutions in a stochastic sense, i.e. they are supermartingales. In the deterministic case one can use Huisken's monotonicity formula to get similar results. With the stochastic perturbation, Huisken's monotonicity formula does not hold because the time-derivative of these energies contains additional Itô-Stratonovich correction terms that are difficult to control. For our gradient-dependent energies we use integration by parts to prove that these correction terms together with terms stemming from the deterministic motion have a good sign. We will apply this result to deduce estimates for the Dirichlet energy in Proposition 5.1 and a maximum principle for the gradient in Proposition 5.2.
With our uniform Lipschitz bounds at hand and Proposition 5.1 we deduce that (2) is coercive and this yields H 2 bounds for (u ε ). Furthermore we derive tightness of their probability laws in appropriate spaces and with the Jakubowski-Skorokhod representation we can deduce that the approximate solutions converge in a weak sense. We then identify the limit in Section 6.
The a priori estimates derived for the solution are also a key ingredient in the analysis of the large-time behavior of solutions.

Existence of viscous approximation
We will use the theory presented in Appendix A to prove existence for a viscous approximation (5) of the stochastic mean curvature flow. The key observation is that the variational framework shall be applied to the equation for ∇u, see (7) below, rather than directly to (5). This is further made possible by the structure of (5) and in particular by the fact that only the gradient of the solution appears on the right hand side of (5).
Theorem 4.1. Let ε > 0, q > 2 and Λ be a Borel probability measure on H 1 (T n ) with Then there is a martingale solution u of (5) for I = [0, ∞) with initial data Λ.
Proof of Theorem 4.1. We intend to apply Theorem A.7 in order to obtain a martingale solution to the equation the gradient ∇u fulfills for u satisfying (5), which in turn yields a martingale solution to (5) itself. To this end, we will work with the spaces We have that V ⊂ H densely and compactly. Furthermore we can identify L 2 (U ; H) = H.
We define the operators We verify that the Assumptions A.6 are fulfilled:
• Coercivity: Using integration by parts and the fact that the boundary terms vanish because of the periodic domain we obtain . Note that we have used the non-negativity of in the second to last inequality and the periodic boundary conditions as well as a Poincaré inequality for mean-free vector fields in the last inequality.
• Growth bounds: We have The other terms in the definition of $A(u_k)$ are linear in $u_k$, hence
Now, from Theorem A.7 we can conclude that there is a martingale solution $\nabla u$ of with a real-valued Brownian motion $W$. Next we will show that (7) is also fulfilled in $H^{-1}(\mathbb{T}^n; \mathbb{R}^n)$, hence weak in the PDE sense. For an arbitrary $\psi \in H^1(\mathbb{T}^n; \mathbb{R}^n)$ we take the Helmholtz decomposition $\psi = \nabla w + \phi$ with $w \in H^2(\mathbb{T}^n)$ and $\phi \in H^1(\mathbb{T}^n; \mathbb{R}^n)$ with $\nabla\cdot\phi = 0$ and since both sides of the equation for $\nabla u$ are orthogonal to divergence-free vector fields, we have and therefore the equation for $\nabla u$ is also fulfilled in Note that by assumption $u_0 \in L^2(\Omega; L^2(\mathbb{T}^n))$ and also for $T \in [0,\infty)$ Hence, $\tilde u \in L^2(\Omega; L^2(0,T; L^2(\mathbb{T}^n)))$. Furthermore and by (8) $\tilde u$ is a martingale solution of (5).
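The Poincaré inequality for mean-free vector fields used in the coercivity estimate can be made explicit on the torus via Fourier series. The following sketch assumes that $\mathbb{T}^n$ is the unit cube with periodic boundary conditions, as stated in the introduction; the Fourier coefficients $\hat f(k)$, $k \in \mathbb{Z}^n$, are notation introduced here.

```latex
% For mean-free f \in H^1(\mathbb{T}^n), i.e. \hat f(0) = 0, and since |k| \ge 1 for k \ne 0:
\|f\|_{L^2(\mathbb{T}^n)}^2 = \sum_{k\neq 0} |\hat f(k)|^2
  \le \sum_{k\neq 0} |k|^2\,|\hat f(k)|^2
  = \frac{1}{4\pi^2}\,\|\nabla f\|_{L^2(\mathbb{T}^n)}^2 .
\\
% Each component of \nabla u is mean-free by periodicity: \int_{\mathbb{T}^n} \partial_i u\,dx = 0. Hence
\|\nabla u\|_{L^2(\mathbb{T}^n)} \le \frac{1}{2\pi}\,\|D^2 u\|_{L^2(\mathbb{T}^n)}
\qquad\text{for } u \in H^2(\mathbb{T}^n).
```

This is why control of the Hessian alone suffices for coercivity in the equation for $\nabla u$, without a separate lower-order term.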

A priori estimates
In this section we will prove a priori energy estimates for solutions of the viscous equation (5) which are uniform in $\varepsilon > 0$ and also hold true for solutions of the SMCF equation (2). The first proposition basically says that the Dirichlet energy of solutions is decreasing and extends the coercivity proven in Section 4.
Proposition 5.1 (Weak coercivity). Let $\varepsilon \ge 0$ and $u$ be a strong solution of (5). Then the energy $\int_{\mathbb{T}^n} |\nabla u|^2$ is a supermartingale.
Furthermore, we can quantify the decay by

If in addition
In the next proposition we prove that the additional assumptions from Proposition 5.1 can be verified if the Lipschitz constant of the initial condition is uniformly bounded.
Proposition 5.2 (Maximum principle for the gradient of solutions). Let $\varepsilon \ge 0$ and $u$ be a strong solution of (5).
(5). For a function $f \in C^2(\mathbb{R}^n)$ with bounded second order derivatives and
Proof. To abbreviate the calculations we will write $Q := Q(\nabla u)$ and $v := v(\nabla u)$. With this notation we have $\nabla Q = \nabla\big(Q(\nabla u)\big) = D^2u\,v(\nabla u) = D^2u\,v$. We can apply Corollary A.5 to infer
The term $\mu_{\mathrm{viscous}}$ corresponds to the time derivative of $I$ along solutions of the heat equation. It is weighted with $\varepsilon$ because it appears due to the additional viscosity added to the equation. The term $\mu_{\mathrm{mcf}}$ corresponds to the time derivative of $I$ along solutions of the unperturbed mean curvature flow of graphs. It is weighted with the factor $\tfrac12$ because the other part has to be used in $\mu_{\mathrm{pert}}$ to handle the additional terms coming from the perturbation. We handle $\mu_{\mathrm{viscous}}$, $\mu_{\mathrm{mcf}}$ and $\mu_{\mathrm{pert}}$ separately using partial integrations and the periodicity of the functions. For $\mu_{\mathrm{viscous}}$ we calculate For $\mu_{\mathrm{mcf}}$ we calculate For $\mu_{\mathrm{pert}}$ we calculate Inserting these calculations into (9) yields the result.
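The identity $\nabla Q = D^2u\,v$ used in the proof is an instance of the chain rule. The short computation below spells it out in components, with $Q(p) = \sqrt{1+|p|^2}$ and $v(p) = p/Q(p)$; the index notation is introduced here for illustration.

```latex
% Derivative of Q with respect to its argument p \in \mathbb{R}^n:
\partial_{p_j} Q(p) = \frac{p_j}{\sqrt{1+|p|^2}} = v_j(p), \qquad j = 1,\dots,n.
\\
% Chain rule and symmetry of D^2u:
\partial_i\big(Q(\nabla u)\big)
  = \sum_{j=1}^n v_j(\nabla u)\,\partial_i\partial_j u
  = (D^2u\,v)_i , \qquad i = 1,\dots,n.
\\
% Consequences used repeatedly:
v\cdot\nabla Q = v\cdot D^2u\,v, \qquad |\nabla Q| \le |D^2u|\,|v| \le |D^2u| .
```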
We will next explore for which choices of f Lemma 5.3 yields a control on appropriate quantities. We therefore choose f as a function of Q(∇u), which gives more geometric meaning to the estimates and is still sufficient to obtain the required estimates, see the remarks below.
(i) Note that all terms on the left hand side in Lemma 5.4 are non-negative.
and
Proof of Lemma 5.4. Let $f(p) := g(Q(p))$ for $p \in \mathbb{R}^n$. Then Since $g''$ is bounded we infer that $g'$ grows at most linearly and therefore $D^2 f$ is bounded. Furthermore, we calculate Note that Now the eigenvalues of (10) are given by which shows the non-negativity of (10). We will again use the notation $Q = Q(\nabla u)$ and $v = v(\nabla u)$. We can apply Lemma 5.3 to $I(t)$ and deduce : Because of the non-negativity of (10) and Proposition B.1, $I$ is a non-negative local supermartingale. We can apply Fatou's Lemma to get rid of the locality and deduce that Now, for $q \in [1,2]$ we want to use the Itô formula for the function $x \mapsto |x|^q$. This function is not twice continuously differentiable for $q < 2$, so the classical Itô formula does not apply directly. Nevertheless, we can first do the calculations for $\vartheta > 0$ and the function $x \mapsto (\vartheta + x)^q$, which is twice continuously differentiable on $[0,\infty)$, and then send $\vartheta \to 0$. We infer As before, since the stochastic integral defines a local martingale and using Fatou's lemma, we get For the stochastic integral we can apply the Burkholder-Davis-Gundy inequality
After Lemma 5.4 has been established we can apply it to prove Proposition 5.1 and Proposition 5.2.
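The boundedness of $D^2 f$ claimed in the proof for $f = g\circ Q$ can be checked directly. The computation below is a sketch for a general $g \in C^2$; it concerns the Hessian of $f$ itself, and whether this coincides with the matrix referred to as (10) cannot be read off from the text above.

```latex
% Q(p) = \sqrt{1+|p|^2}, v(p) = p/Q(p); first derivatives:
DQ(p) = v(p), \qquad Dv(p) = \frac{\mathrm{Id} - v(p)\otimes v(p)}{Q(p)} .
\\
% Gradient and Hessian of f = g \circ Q:
Df(p) = g'(Q)\,v, \qquad
D^2 f(p) = g''(Q)\, v\otimes v + \frac{g'(Q)}{Q}\,\big(\mathrm{Id} - v\otimes v\big).
\\
% Eigenvalues: g'(Q)/Q on v^\perp, and in direction v (using 1 - |v|^2 = Q^{-2}):
g''(Q)\,|v|^2 + \frac{g'(Q)}{Q^3} .
```

If $g''$ is bounded, then $|g'(r)| \le C(1 + r)$, so $g'(Q)/Q$ is bounded because $Q \ge 1$, and together with $|v| < 1$ this gives a uniform bound on $D^2 f$.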
Note that $d\int_{\mathbb{T}^n} |\nabla u|^2 = d\int_{\mathbb{T}^n} g(Q(\nabla u))$. Then by Lemma 5.4 for $q = 1$, As the next step, we establish the maximum principle.
s. for all $t \in I$. Therefore $\|\nabla u\|_{L^\infty(I; L^\infty(\mathbb{T}^n))} \le L$.
Remark 5.6. We can also use Lemma 5.4 for $g(r) = r$ to deduce bounds for the $q$-th moment of the area. In particular for $q = 1$ and $\varepsilon = 0$ we get In geometrical terms (11) becomes where $\Gamma(t) = \operatorname{graph} u(t)$, $H(t)$ is the mean curvature and $|A(t)|$ is the length of the second fundamental form of $\Gamma(t)$ for $t \in I$. Compare this with the deterministic MCF, where for a solution $(\Gamma(t))_{t\ge 0}$ the natural energy identity is $\mathcal{H}^n(\Gamma(t)) + \int_0^t \int_{\Gamma(s)} H^2(s)\,d\mathcal{H}^n\,ds = \mathcal{H}^n(\Gamma(0))$.
However, we will not use this estimate since an L ∞ bound for the gradient and an L 2 bound for the Hessian are available via Proposition 5.2 and Proposition 5.1.

Vanishing viscosity limit
With the above uniform estimates at hand, we are in a position to pass to the limit as $\varepsilon \to 0$ and establish the existence of a martingale solution to the stochastic mean curvature flow (2).
In the remaining part of the proof we will show that these bounds imply the existence of a convergent subsequence in a weak sense and we will identify the limit with a solution of (2). We will follow the same strategy as in the proof of Theorem A.7, where we pass from the finite-dimensional approximations to a solution. Since the line of arguments is very similar but slightly more involved in the case of Theorem A.7, we give a detailed proof only for the latter theorem and here just comment on the main ideas and on the differences between both proofs.
Using the compactness of the embeddings, the joint laws of $(u^\varepsilon, W^\varepsilon)$ are tight in Since $T > 0$ is arbitrary this also implies the tightness in $X_u \times X_W$ with Now we can argue via the Jakubowski-Skorokhod representation theorem for tight sequences in nonmetric spaces [33, Theorem 2] to deduce the existence of a subsequence $\varepsilon_k \searrow 0$, a probability space $(\tilde\Omega, \tilde{\mathcal{F}}, \tilde{\mathbb{P}})$ and $X_u \times X_W$-valued random variables $(\tilde u_k, \tilde W_k)$ for $k \in \mathbb{N}$ and $(\tilde u, \tilde W)$ such that $\tilde u_k \to \tilde u$ a.s. in $X_u$, $\tilde W_k \to \tilde W$ a.s. in $X_W$ and the joint laws of $(\tilde u_k, \tilde W_k)$ agree with the joint laws of $(u^{\varepsilon_k}, W^{\varepsilon_k})$ for $k \in \mathbb{N}$. Let $\tilde{\mathcal{F}}$ One can prove that $\tilde W_k$ is a real-valued $(\tilde{\mathcal{F}}^k_t)_t$-Wiener process and $\tilde u_k$ is a solution of (5) for $\varepsilon_k$ and the Wiener process $\tilde W_k$.
With the a.s. convergences in $X_W$ resp. $X_u$ and the uniform bounds derived before one can pass to the limit in the equations and infer that $\tilde W$ is a real-valued $(\tilde{\mathcal{F}}_t)_t$-Wiener process and $\tilde u$ is a solution of (2). In contrast to the proof of Theorem A.7 the operator in the deterministic part of the equation changes, but the convergence $u^\varepsilon \rightharpoonup u$ in $H^2(\mathbb{T}^n)$ implies which is enough to pass to the limit in the equation. Because of the uniform bounds of $(u^\varepsilon)$ in $L^2(\Omega; L^2(0,T; H^2(\mathbb{T}^n)))$ for all $T > 0$ we know that the limit $\tilde u$ is already a martingale solution.
For a function w ∈ L 1 (T n ) with ∇w ∈ L ∞ (T n ) we have w ∈ W 1,∞ (T n ) with Hence, u ∈ L ∞ (0, t; W 1,∞ (T n )) P-a.s. for all t ∈ I. From the proof of Theorem 3.5 we deduce that u ∈ L 2 (Ω; C 0,λ ([0, t]; L 2 (T n ))) for all t ∈ I and some λ > 0. In combination with the previous result and [52, Theorem 5] this yields u ∈ C([0, t]; C(T n )) P-a.s. for all t ∈ I. Using sharper interpolation results one can prove that the solution is pathwise Hölder continuous in space and time.

Large-time behavior
In this section, we study the large-time behavior of solutions to (2).
For the convergence as $T \to \infty$ we note that by Corollary A.5 To bound the drift we estimate where we have used $|v(p)| \le \min\{|p|, 1\}$ and a Poincaré inequality, and infer Furthermore we have $Q(p) - 1 \le |p|$ and therefore for the martingale part of $\alpha$ the bound hence $\alpha \in L^1(\Omega)$ is a well-defined random variable. We find a sequence $(t_k)_{k\in\mathbb{N}}$ of increasing times $t_k \to \infty$ such that $\mathbb{E}\,\|D^2u(t_k)\|^2_{L^2(\mathbb{T}^n)} \to 0$ for $k \to \infty$.
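The two elementary bounds on $v$ and $Q$ used above follow directly from the definitions $Q(p) = \sqrt{1+|p|^2}$ and $v(p) = p/Q(p)$. A brief verification:

```latex
% Bound on v: the denominator dominates both 1 and |p|
|v(p)| = \frac{|p|}{\sqrt{1+|p|^2}} \le \min\{|p|,\, 1\}
  \qquad\text{since } \sqrt{1+|p|^2} \ge \max\{1,\, |p|\}.
\\
% Bound on Q - 1: multiply and divide by Q + 1, then use Q(p) \ge |p|
Q(p) - 1 = \frac{Q(p)^2 - 1}{Q(p) + 1} = \frac{|p|^2}{Q(p) + 1}
  \le \frac{|p|^2}{|p| + 1} \le |p| .
```

In particular both $v(\nabla u)$ and $Q(\nabla u) - 1$ are controlled linearly by $|\nabla u|$, which is what allows the drift and the martingale part to be estimated through the Dirichlet energy.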

Now, we apply a Poincaré inequality to obtain
From Proposition 5.1 we infer that $\|\nabla u\|^2_{L^2(\mathbb{T}^n)}$ is a non-negative supermartingale and For the second term in (14) we have with the Burkholder-Davis-Gundy inequality and the estimates from above
From Theorem 3.6 we can deduce the next corollary, which extends the one-dimensional result from [20, Theorem 4.2] to higher dimensions. Furthermore it improves the convergence in distribution in $C_{loc}([0,\infty); L^2(\mathbb{T}^n))$ to convergence in $L^1(\Omega; C_b([0,\infty); H^1(\mathbb{T}^n)))$.
Corollary 7.1. Let u be a strong solution of (2) with Then for T → ∞ we have

Appendix A. Variational SPDE under a compactness assumption
In this section we will consider infinite-dimensional stochastic differential equations with a variational structure. In Appendix A.1 we will prove an Itô formula for this kind of equation, which will be used in Appendix A.2 to show existence for variational SPDEs. During the whole section we will work with the following assumptions.
Assumptions A.1. Let V and H be separable Hilbert spaces with V ⊂ H ≃ H ′ ⊂ V ′ and V densely and compactly embedded in H. Furthermore we will consider another separable Hilbert space U , which will be the space where a Wiener process is defined. For notational convenience we will restrict the presentation to the case of infinite-dimensional spaces, although finite-dimensional spaces could be treated as well.
Then we can find an orthonormal basis $(e_k)_{k\in\mathbb{N}}$ of $H$ which is an orthogonal basis of $V$ and we will use the abbreviation $\lambda_k := \|e_k\|_V^2$ for $k \in \mathbb{N}$. We will assume that the $(e_k)_k$ are arranged such that $(\lambda_k)_k$ is a non-decreasing sequence. Furthermore we will denote by $(g_l)_{l\in\mathbb{N}}$ an orthonormal basis of $U$.
If not otherwise specified then a cylindrical Wiener process $W$ on $U$ with respect to a filtration $(\mathcal{F}_t)_t$ will always be assumed to have the representation $W(t) = \sum_{l\in\mathbb{N}} g_l\,\beta_l(t)$ with mutually independent real-valued $(\mathcal{F}_t)_t$-Brownian motions $(\beta_l)_{l\in\mathbb{N}}$.
Note that this generalization is similar to the result presented in [45, II.II. §4], where it was proven under slightly different assumptions, e.g. the embedding $V \subset H$ is not assumed to be compact and $V$ is only a separable Banach space, which is uniformly smooth and convex, but $F$ is assumed to be twice Fréchet differentiable. For the reader's convenience we will discuss the main steps of a different proof here, which makes use of the stronger assumptions compared to [45] and adapts the proof of [15, Proposition A.1], where $H$ and $V$ are assumed to be Sobolev spaces. In (16) below, we define a smoothing operator which in fact is the semigroup generated by $-J_V$ with $J_V : V \to V'$ given by the canonical identification of the Hilbert space $V$ with its dual space. If $V = H^1_0(\mathbb{T}^n)$, $H = L^2(\mathbb{T}^n)$, then this is the classical heat semigroup.
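Under Assumptions A.1 the semigroup generated by $-J_V$ can be written down explicitly in the basis $(e_k)_k$. The following is a sketch of what the description above implies, not a reproduction of (16); the name $S_\varepsilon$ for the smoothing operator is hypothetical and used only here.

```latex
% (e_k): orthonormal basis of H, orthogonal in V, \lambda_k = \|e_k\|_V^2 (Assumptions A.1)
\langle J_V e_k, e_j\rangle_{V',V} = \langle e_k, e_j\rangle_V = \lambda_k\,\delta_{kj}
  = \langle \lambda_k e_k, e_j\rangle_H
\quad\Longrightarrow\quad J_V e_k = \lambda_k\, e_k .
\\
% Hypothetical name S_\varepsilon for the smoothing operator:
S_\varepsilon u := e^{-\varepsilon J_V} u
  = \sum_{k\in\mathbb{N}} e^{-\varepsilon\lambda_k}\,\langle u, e_k\rangle_H\, e_k ,
\qquad
\|S_\varepsilon u\|_H \le \|u\|_H, \quad \|S_\varepsilon u\|_V \le \|u\|_V .
```

Because $\lambda_k \ge 0$, each Fourier mode is damped, so $S_\varepsilon$ is a contraction on both $H$ and $V$ and $S_\varepsilon u \to u$ as $\varepsilon \to 0$ by dominated convergence in the series.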
Step 2: $u$ takes values in $H$. We will apply the Itô formula [14, Theorem 4.32] for the function $\|\cdot\|_H^2$. This gives Using the Burkholder-Davis-Gundy inequality and the assumptions on $u$, $v$ and
Step 3: Proving the Itô formula.

Now, in Lemma A.4 it is verified that
uniformly continuous on bounded subsets of V . We apply the Itô formula [14,Theorem 4.32] to conclude that Because of the assumptions on F and an infinite dimensional version of the dominated convergence theorem for stochastic integrals [49, Theorem IV.32] we can pass to the limit ε → 0 on both sides of this equation.
Hence, $F(u)$ has a continuous version for which
Step 4: $u$ has a continuous version. We infer from the calculations above that there is a version of $u$ such that $\|u\|_H^2$ is continuous and From (15) and (17)
Proof. We only have to prove the continuity of $D^2F|_V : V \to L(V; V')$ and the uniform continuity on bounded subsets of $V$.
The compactness of the embeddings V ⊂ H ≃ H ′ ⊂ V ′ implies that the embedding L(H) ⊂ L(V ; V ′ ) is compact. Thus, when u k ⇀ u in V then u k → u in H and by the assumptions from Proposition A.3 we infer We will apply Proposition A.3 to the appropriate spaces for (2).
Proof. We consider the spaces $V = H^2(\mathbb{T}^n)$ and $H = H^1(\mathbb{T}^n)$. To work in the framework from above we have to do the rather unusual identification of $w \in H^1(\mathbb{T}^n)$ which is an equation for $J_H u$ in $V'$. We consider the function $G$ : Since $F \in C^2$ it is easy to check that $G \in C^1(H^1(\mathbb{T}^n))$ and that the second Gâteaux derivative $D^2G$ exists. We calculate for $w, \varphi, \psi \in H^1(\mathbb{T}^n)$ We have that $G$ and $DG$ are bounded on bounded subsets of $H^1(\mathbb{T}^n)$ and that $D^2G$ is bounded because of the bounds of the second derivatives of $F$. On bounded subsets of $L(H) = L_1(H)^*$ the weak-$*$ topology is equivalent to the weak operator topology and therefore the continuity of $D^2G : H \to (L(H), w^*)$ follows from the fact that for all $w_k \to w$ in $H^1(\mathbb{T}^n)$ and all $\varphi, \psi \in H^1(\mathbb{T}^n)$ we have Because of the assumptions on $F$ we find that $\Phi(w) \in L^2(\mathbb{T}^n)$ for $w \in H^2(\mathbb{T}^n)$ and $\Phi : H^2(\mathbb{T}^n) \to L^2(\mathbb{T}^n)$ is continuous. Since for the restriction of $J_H$ to $H^2(\mathbb{T}^n)$ we have that $J_H|_{H^2(\mathbb{T}^n)}$ : Note that for the application of Proposition A.3 we shall have an equation for $du$ in $V'$, whereas (19) is an equation for $du$ in $L^2(\mathbb{T}^n)$. Therefore we have to use (20) to infer that a.s. for all $t \in [0,T]$
A.2. Existence for variational SPDEs. We will adapt the approach of [47, Section 2.3.3] to prove existence of weak solutions for variational SPDEs which goes back to [54].
In addition to Assumptions A.1 we will make the following assumptions.
Assumptions A.6. Let $A : V \to V'$ and $B : V \to L_2(U; H)$. We will write $B^* : V \to L_2(H; U)$ for the adjoint operator $B^*(u) := B(u)^*$. We assume:
• Coercivity: There are constants $\alpha, C > 0$ such that
• Growth bounds: There are constants $C > 0$ and $\delta \in (0, 2]$ such that
• Continuity: $A : V \to V'$ is weak-weak-$*$ sequentially continuous, that is,
and $B^* : V \to L_2(H; U)$ is sequentially continuous from the weak topology on $V$ to the strong operator topology on $L(H; U)$, that is,

The assumptions (22), (23) and (26) are the same as in [47], whereas (27) is weaker. Furthermore, we have replaced the sublinear growth bound from [47] for $B(u)$ by the weaker assumptions (24) and (25). These weaker assumptions are necessary to apply the theory to the viscous equation (5). To prove this generalization we have to prove bounds for higher moments of the $\|\cdot\|_H$ norm of the approximations, whereas in the proof in [47] only the second moment of the $\|\cdot\|_H$ norm needed to be bounded. This will be done in Proposition A.9 under the additional assumption that the corresponding higher moment of the $\|\cdot\|_H$ norm is bounded for the initial data.

Similarly to the ideas of [28], we will use the Jakubowski-Skorokhod representation theorem [33] for tight sequences in non-metric spaces to prove that our approximations converge on a different probability space. We will use arguments similar to those in [10] to handle the unbounded time interval. Finally, we will show that this limit is a martingale solution of (21) using a general method of constructing martingale solutions without relying on any kind of martingale representation theorem, which was introduced in [11] and already used in [44] and [28], among others.
We will use a standard Galerkin scheme (compare with [47, Chapter 2.3]) to prove that there is a martingale solution of (21) if the initial condition has bounded $q$-th moment in $H$ for some $q > 2$. With the $(e_k)_{k \in \mathbb{N}}$ as in Assumptions A.1 we will write

Our main result is:

Theorem A.7. Let $q > 2$ and let $\Lambda$ be a Borel probability measure on $H$ with finite $q$-th moment

Then there is a martingale solution of (21) with initial data $\Lambda$. This means that there is a stochastic basis $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \in [0,\infty)}, \mathbb{P})$ with a normal filtration, a cylindrical $(\mathcal{F}_t)$-Wiener process $W$ on $U$ and a predictable $u$ with $u \in L^2(\Omega; L^2(0, T; V)) \cap L^2(\Omega; C([0, T]; H))$ for all $T > 0$ and

To prove Theorem A.7, we will consider (21) on the finite-dimensional space $V_N$.
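As a purely illustrative aside, the restriction to a finite-dimensional space can be sketched numerically for a toy linear model. The sketch below is not equation (21) of this paper. It assumes a diagonal operator $A e_k = -k^2 e_k$, a diagonal multiplicative noise $B(u) g_l = \sigma u_l e_l$ and a semi-implicit Euler-Maruyama time step. The function name `galerkin_euler_maruyama` and all parameters are hypothetical.

```python
import numpy as np


def galerkin_euler_maruyama(u0, T=1.0, steps=1000, sigma=0.5, seed=0):
    """Simulate the N-mode system du_k = -k^2 u_k dt + sigma u_k dbeta_k.

    Assumed toy model (not the paper's equation): u0 holds the first N
    coefficients of the initial datum with respect to the basis e_1..e_N.
    Returns an array of shape (steps + 1, N) with the coefficient path.
    """
    rng = np.random.default_rng(seed)
    u = np.array(u0, dtype=float)
    n_modes = u.size
    lam = np.arange(1, n_modes + 1, dtype=float) ** 2  # eigenvalues k^2 of -A
    dt = T / steps
    path = np.empty((steps + 1, n_modes))
    path[0] = u
    for i in range(steps):
        # Independent Brownian increments, one per mode.
        dbeta = rng.normal(0.0, np.sqrt(dt), size=n_modes)
        # Drift treated implicitly, noise treated explicitly.
        u = (u + sigma * u * dbeta) / (1.0 + lam * dt)
        path[i + 1] = u
    return path


if __name__ == "__main__":
    path = galerkin_euler_maruyama([1.0, 0.5, 0.25])
    h_norm = np.sqrt((path ** 2).sum(axis=1))  # discrete H-norm along the path
    print(h_norm[0], h_norm[-1])
```

In this toy model $2\langle Au, u\rangle + \|B(u)\|_{L_2}^2 = \sum_k (-2k^2 + \sigma^2) u_k^2$, so a coercivity estimate holds trivially. The sketch only illustrates the structure of a Galerkin system driven by finitely many Brownian motions.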
Theorem A.8. Let $N \in \mathbb{N}$ and let $\Lambda$ be a Borel probability measure on $H$. Then there is a weak solution of the finite-dimensional approximation of (21). This means that there is a stochastic basis $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \in [0,\infty)}, \mathbb{P})$ with a normal filtration, mutually independent real-valued $(\mathcal{F}_t)$-Brownian motions $\beta_1, \dots, \beta_N$ and a predictable $V_N$-valued process $u$ with $u \in L^2(\Omega; C([0, T]; V_N))$ for all $T > 0$ such that

Proposition A.9 (Estimates for the norm). Assume that $T > 0$ and that $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \in [0,T]}, \mathbb{P})$ is a stochastic basis with a normal filtration. Then there is a constant $C > 0$ that depends only on the constants from Assumptions A.6, such that for all mutually independent real-valued $(\mathcal{F}_t)$-Brownian motions $(\beta_l)_{l \in \mathbb{N}}$, all $N \in \mathbb{N}$ and all $V_N$-valued predictable processes $u \in L^2(\Omega; C([0, T]; V_N))$ with

Additionally, there is a $q_0 > 2$ such that $u(0) \in L^q(\Omega; H)$ for some $q \in (2, q_0)$ implies $u \in L^\infty(0, T; L^q(\Omega; H))$ with

Proof. From Proposition A.3 we conclude that the following Itô formula holds for the norm of solutions

where $B_N : V \to L_2(U; H)$ is $B$ restricted to the finite-dimensional subspaces,

For $q \geq 1$ we use the Itô formula for real-valued semimartingales to deduce that

By taking the expectation and using the coercivity (22) as well as the growth bounds (24), we conclude for $q \in [1, 1 + \varepsilon)$ with $\varepsilon < \frac{\alpha}{C}$, where $C$ depends on the constants from (22) and (24), that

and with a Gronwall argument

This already implies that there is a constant $C > 0$ such that

Furthermore, we have for the stochastic integral in (28)

From (29) for $q = 1$ we infer

and therefore E sup

Lemma A.10. Let $T > 0$ and

are sequentially continuous.
For $B^*$ we have with the growth bound (25)

The right-hand side is uniformly integrable, because $\|u_k(t)\|_V^{2-\delta}$ is bounded in $L^{\frac{2}{2-\delta}}(0, T)$ and $\|u_k(t)\|_H^2$ is convergent in $L^1(0, T)$. Therefore, by Vitali's convergence theorem,

because $(u_k)_{k \in \mathbb{N}}$ is uniformly bounded in $L^2(0, T; V)$. As above, one can conclude from the growth assumptions (23) and (25) and the fact that

are uniformly integrable with respect to $k$ and $M$. Hence,

converge to 0 by first choosing $M$ large such that the first terms on the right-hand side become small and then choosing $k$ large and using the convergences derived above.
Proof of Theorem A.7. For $N \in \mathbb{N}$ let $V_N := \operatorname{span}\{e_1, \dots, e_N\}$ and consider the $V_N$-valued process $u_N$ from Theorem A.8. The process $u_N$ is a weak solution of the finite-dimensional approximation of (21) for a Wiener process $W_N$ on $U$ with covariance operator $Q_N : U \to \operatorname{span}\{g_1, \dots, g_N\}$, which is the orthogonal projection. We can assume that the processes $(u_N)_{N \in \mathbb{N}}$ are defined on one common probability space $(\Omega, \mathcal{F}, \mathbb{P})$ = (

Furthermore, we can always assume that $q > 2$ is sufficiently small such that the following arguments hold. We can apply Proposition A.9 to infer that for all $T > 0$ the sequence $(u_N)_{N \in \mathbb{N}}$ is bounded, uniformly in $N \in \mathbb{N}$, in $L^2(\Omega; C([0, T]; H)) \cap L^2(\Omega; L^2(0, T; V)) \cap L^\infty(0, T; L^q(\Omega; H))$.
Let Z be another separable Hilbert space with a Hilbert-Schmidt embedding V ′ ⊂ Z.
Because of (25) we have that $B_N(u_N)$ is uniformly bounded in $L^q(\Omega; L^q(0, T; L_2(U; Z)))$, since

Using the factorization method [51, Theorem 1.1] we get a uniform bound for $u_N$ in $L^2(\Omega; C^{0,\lambda}([0, T]; Z))$ for some $\lambda > 0$. Now, consider another separable Hilbert space $U_1$, which is the completion of $U$ with respect to the scalar product $\langle g_{l_1}, g_{l_2} \rangle_{U_1} = a_{l_1}^2 \langle g_{l_1}, g_{l_2} \rangle_U$ for $l_1, l_2 \in \mathbb{N}$, where $(a_l)_{l \in \mathbb{N}} \subset \mathbb{R}$ is a square-summable sequence. Then $U$ is densely embedded in $U_1$ with a Hilbert-Schmidt embedding, and each $W_N$ can be understood as a Wiener process on $U_1$ with covariance operators uniformly bounded in $L_1(U_1)$. Hence, with a factorization argument for $\lambda \in (0, \frac{1}{2})$, the $(W_N)_N$ are uniformly bounded in $L^2(\Omega; C^{0,\lambda}([0, T]; U_1))$. For $\lambda > 0$ the embeddings

where $C_{loc}([0, \infty); (H, w))$ and $C_{loc}([0, \infty); (U_1, w))$ are endowed with the compact-open topology, if and only if for all $T > 0$ the set

(with all of its elements restricted to $[0, T]$) is compact in $X_u^T \times X_W^T$, we conclude similarly to [10, Proof of Proposition 4.3] that the joint laws of $(u_N, W_N)$ are tight in $X_u \times X_W$.
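The uniform trace bound in the weighted space $U_1$ can be checked numerically in a simple case. The following Python sketch assumes the specific weights $a_l = 1/l$, which are an illustrative choice and not taken from the paper, and uses that $W_N(t) = \sum_{l \le N} \beta_l(t) g_l$ has $\mathbb{E}\|W_N(t)\|_{U_1}^2 = t \sum_{l \le N} a_l^2$. The function names are hypothetical.

```python
import numpy as np


def trace_in_weighted_space(n_modes):
    """Trace of the covariance of W_N in U_1 for the assumed weights a_l = 1/l."""
    l = np.arange(1, n_modes + 1)
    return float(np.sum((1.0 / l) ** 2))


def mean_square_norm_mc(n_modes, t=1.0, samples=20000, seed=0):
    """Monte Carlo estimate of E||W_N(t)||_{U_1}^2 for the same weights."""
    rng = np.random.default_rng(seed)
    l = np.arange(1, n_modes + 1)
    weights = (1.0 / l) ** 2  # squared U_1-norms of the basis vectors g_l
    # Each row holds one sample of (beta_1(t), ..., beta_N(t)).
    beta_t = rng.normal(0.0, np.sqrt(t), size=(samples, n_modes))
    return float(np.mean((beta_t ** 2) @ weights))


if __name__ == "__main__":
    for n in (10, 100, 1000):
        # Partial sums stay below the limit pi^2 / 6, uniformly in N.
        print(n, trace_in_weighted_space(n), np.pi ** 2 / 6)
    print(mean_square_norm_mc(50), trace_in_weighted_space(50))
```

The partial sums increase in $N$ but remain below $\sum_l a_l^2$, which is the square-summability used for the uniform bound in $L_1(U_1)$.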
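Because of Lemma A.11 we can apply the Jakubowski-Skorokhod representation theorem for tight sequences in non-metric spaces [33, Theorem 2] to deduce the existence of a probability space $(\tilde{\Omega}, \tilde{\mathcal{F}}, \tilde{\mathbb{P}})$, a strictly increasing sequence $(N_m)_{m \in \mathbb{N}} \subset \mathbb{N}$, $X_u$-valued random variables $\tilde{u}_m, \tilde{u}$ and $X_W$-valued random variables $\tilde{W}_m, \tilde{W}$ for $m \in \mathbb{N}$ such that $\tilde{u}_m \to \tilde{u}$ $\tilde{\mathbb{P}}$-a.s. in $X_u$, $\tilde{W}_m \to \tilde{W}$ $\tilde{\mathbb{P}}$-a.s. in $X_W$, and the joint law of $(\tilde{u}_m, \tilde{W}_m)$ coincides with the joint law of $(u_{N_m}, W_{N_m})$ for all $m \in \mathbb{N}$. To simplify the notation, we will assume that $N_m = m$ for $m \in \mathbb{N}$.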
Let $\mathcal{N} := \{M \in \tilde{\mathcal{F}} \mid \tilde{\mathbb{P}}(M) = 0\}$. We will consider the augmented filtration $(\tilde{\mathcal{F}}_t)_{t \in [0,\infty)}$, which is defined by

The augmented filtration $(\tilde{\mathcal{F}}_t)_t$ is a normal filtration. For $m \in \mathbb{N}$ we can do the same construction to define the natural filtration $(\mathcal{G}^m_t)_t$ and the augmented filtration $(\tilde{\mathcal{F}}^m_t)_t$ of $(\tilde{u}_m, \tilde{W}_m)$. We fix $k \in \mathbb{N}$ and define for $t \in [0, \infty)$

Since the joint law of $(\tilde{u}_m, \tilde{W}_m)$ coincides with the joint law of $(u_m, W_m)$, we infer for $l_1, l_2 \in \mathbb{N}$ and $m$ large enough that

The Burkholder-Davis-Gundy inequality for $W_m$ yields the uniform bound

Similarly, because of Lemma A.10 and the convergence $Q_m \to \mathrm{Id}$ in $L(U)$, we conclude that in each of the above equations in (33) we have the pointwise convergence of the variables for $m \to \infty$. Furthermore, the Burkholder-Davis-Gundy inequality for $M_m$, the growth bound (25) and the estimates in Proposition A.9 imply for some $q > 2$

$\tilde{\mathbb{E}}|\tilde{M}_m(t)|^q = \mathbb{E}|M_m(t)|^q \leq C\,\mathbb{E}$

Again with the Vitali convergence theorem, we can pass to the limit in the equations (33) and infer

Since the equations in (34) hold for all $\gamma$, we conclude that $\tilde{W}$ is a square-integrable $(\mathcal{G}_t)_t$-martingale with $(\mathcal{G}_t)_t$-quadratic variation in $U$ given by

Since $\tilde{W}$ is continuous, we infer that $\tilde{W}$ is also a square-integrable $(\tilde{\mathcal{F}}_t)_t$-martingale and (36) also holds for the quadratic variation with respect to $(\tilde{\mathcal{F}}_t)_t$. By the Lévy martingale characterization [14, Theorem 4.6] we conclude that $\tilde{W}$ is a cylindrical $(\tilde{\mathcal{F}}_t)_t$-Wiener process on $U$. Similarly, as (35) holds for all $\gamma$, we conclude that $\tilde{M}$ is a square-integrable $(\mathcal{G}_t)_t$-martingale. Since $\tilde{M}$ is continuous by definition (31), it is also a square-integrable $(\tilde{\mathcal{F}}_t)_t$-martingale. From (35)

Continuity of $\tilde{u}$ follows from Proposition A.3.