Abstract
We develop a solution theory in Hölder spaces for a quasi-linear stochastic PDE driven by an additive noise. The key ingredients are two deterministic PDE lemmas which establish a priori Hölder bounds for a parabolic equation in divergence form with irregular right-hand-side term. We apply these bounds to the case of a right-hand-side noise term which is white in time and trace class in space, to obtain stretched exponential bounds for the Hölder semi-norms of the solution.
1 Introduction
We are interested in the quasi-linear equation
$$\begin{aligned} \partial _t u - \nabla \cdot A(\nabla u) = \xi \end{aligned}$$ (1)
with unknown \(u :\mathbb {R}_t \times \mathbb {R}^d_x \rightarrow \mathbb {R}\) for a nonlinearity \(A :\mathbb {R}^d \rightarrow \mathbb {R}^d\) that is uniformly elliptic. The right hand side \(\xi \) represents an irregular distribution; the key example we have in mind is a noise term which is “white in time” and “coloured in space”. The aim of this article is to develop a priori bounds in Hölder spaces leading to a solution theory for (1).
The regularity of the noise terms appearing in stochastic differential equations is often effectively measured on the Hölder scale. This is well known in the finite-dimensional case, the most classical example being Brownian motion, which has (locally) \(\alpha \)-Hölder continuous trajectories for any \(\alpha <\frac{1}{2}\). Statements in other scales of spaces, e.g. in \(L^2\)-based fractional Sobolev spaces are possible but are weaker: Brownian trajectories almost surely take values in \(H^{\alpha }_{loc}\) for \(\alpha < \frac{1}{2}\), but this does not even imply the continuity of trajectories. It thus seems natural to seek a solution theory in Hölder spaces also for stochastic partial differential equations.
In the case of semi-linear equations such a theory is by now classical and well-developed, see e.g. [1, 3, 11]. For example, in the case of the stochastic heat equation
$$\begin{aligned} \partial _t v - \Delta v = \xi \end{aligned}$$ (2)
the variation-of-constants formula leads to an explicit representation of v in terms of the heat kernel (the so-called “mild solutions”) which can be used to deduce optimal Hölder bounds. This approach extends to equations with lower-order non-linearities such as stochastic reaction–diffusion equations or the stochastic Navier–Stokes equation.
In the case of the quasi-linear equations we consider, there is no natural mild formulation of the equation. However, equations such as (1) have been treated since the 70’s (see e.g. the classical works [5, 9] or [10] for a more recent presentation) using a “variational formulation”, which relies on the theory of monotone operators and yields solutions that satisfy
for all \(T<\infty \) almost surely. In fact, these methods allow for much more general equations; generalisations include degenerate cases such as the porous medium equation.
The aim of the present article is to demonstrate how purely deterministic PDE arguments can be used to improve on the energy inequality (3) and obtain estimates on space–time Hölder norms of \( \nabla u\). Our main deterministic result, Corollary 1, states, roughly speaking, that we can bound the (parabolic) Hölder semi-norm \([\nabla u]_\alpha \) for solutions u of (1) in terms of the corresponding semi-norm \([\nabla v]_\alpha \) for solutions of the linear problem (2). The proof splits into Lemma 1, where this bound is established for a small \(\alpha _0\) using the celebrated De Giorgi–Nash Theorem, and Lemma 2, where it is upgraded to arbitrary \(\alpha \) by Schauder theory. The techniques employed follow classical PDE arguments, as developed for example in [6, 7], but they have to be adjusted to the low-regularity right hand side.
To illustrate the implications of our deterministic result in the case of random \(\xi \), we treat the case where \(\xi \) is a Gaussian distribution that is white in time and coloured in space. This type of noise is commonly studied in the literature, often using the “differential” notation
where W is a Wiener process with spatial covariance operator K. Our assumption on \(\xi \) corresponds to saying that K is a trace-class operator, which is precisely the assumption needed in the variational approach. We restrict ourselves to the case where \(\xi \) is periodic and compactly supported in time. This assumption is made to yield bounds on \(\nabla v\) which hold uniformly over space and time. The only stochastic ingredient of this article is Lemma 3, where Gaussian moments for \([\nabla v]_\alpha \) are established, using the covariance of \(\xi \) and its Gaussianity. Theorem 1 combines the main deterministic result, Corollary 1, and Lemma 3 to construct spatially periodic solutions u with zero initial data, i.e. \(u_{|t \le 0}=0\). We establish existence and uniqueness of solutions to (1) in Theorem 1, as well as stretched exponential moments for \([\nabla u]_\alpha \).
Our result is closely related to the recent work [2], where a Hölder theory for the quasi-linear stochastic PDE
is developed. The first step of that work is to consider the auxiliary equation
The authors use some a priori information in the spirit of the energy estimate (3) as well as martingale inequalities to get a priori control on \(\nabla v\). The key observation in their approach is that this a priori control on v allows one to rewrite the equation for the remainder \(w = u-v\) as
and to obtain Hölder regularity for w using the De Giorgi–Nash Theorem. We pursue a similar strategy, and work with the equation for \(w= u-v\). However, (1) is more non-linear than (5) and the classical PDE results presented in [6, 7] do not immediately apply in this low-regularity situation. Our main deterministic result, Corollary 1, provides the necessary bound.
In a previous version of this work, see [8], we treated a quasi-linear equation
where \(\bar{\xi }\) is a space–time white noise over \(\mathbb {R}_t \times \mathbb {R}_x\), and derived a stretched exponential moment bound akin to (24) on the Hölder semi-norms \([u]_\alpha \). The results in the present article contain this result, up to the different treatment of large scales. Indeed, specialising (1) to the case \(d=1\) and differentiating with respect to x yields for \(\bar{u} = \partial _x u\)
which coincides with (6), noting that our assumptions on \(\xi \) cover the case where \(\bar{\xi }=\partial _x \xi \) is a space–time white noise in one spatial dimension, and that in the one-dimensional case our assumptions on A coincide with the assumptions imposed on \(\pi \) in [8]. The key difference between the approach proposed in [8] and the approach we present here is that the core arguments are now purely deterministic and the use of log-Sobolev inequalities can be fully avoided.
2 Setting
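For the deterministic part of our paper we rewrite the noise term \(\xi \) as \(\partial _t v -\Delta v\) where v solves (2). We thus strive to get bounds of solutions to
$$\begin{aligned} \partial _t u - \nabla \cdot A(\nabla u) = \partial _t v - \Delta v \end{aligned}$$ (7)
in terms of v. Here and throughout the paper we interpret equations in the distributional sense over all of \(\mathbb {R}_t \times \mathbb {R}_x^d\). In order to stress the divergence form of the right hand side, we relabel those terms and write
$$\begin{aligned} \partial _t u - \nabla \cdot A(\nabla u) = \partial _t v + \nabla \cdot j \end{aligned}$$ (8)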
for \(j = -\nabla v\). We present a Schauder theory where we estimate the solution \(\nabla u\) by the data (v, j) in the Hölder space \(C^\alpha \), always with respect to the parabolic distance
This is slightly different from the standard Schauder theory in \(C^{1,\alpha }\), which cannot be applied due to the right-hand-side term \(\partial _tv\) that is irregular in time. In fact we shall control the \(C^{1,\alpha }\)-semi norm of \(w:=u-v\)
where \([\cdot ]_\alpha \) denotes the (parabolic) Hölder semi-norm on space–time \(\mathbb {R}_t\times \mathbb {R}_x^d\)
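To make the parabolic Hölder semi-norm concrete, the following is a minimal numerical sketch. It assumes the common convention \(d((t,x),(t',x')) = \sqrt{|t-t'|}+|x-x'|\) for the parabolic distance and replaces the supremum over all of space–time by a maximum over a finite sample of points, so it only yields a lower bound for the true semi-norm. The names `parabolic_distance` and `holder_seminorm` are illustrative and do not come from the paper.

```python
import math
from itertools import combinations


def parabolic_distance(z1, z2):
    """Assumed convention: sqrt(|t - t'|) + |x - x'| for z = (t, x), x a tuple."""
    (t1, x1), (t2, x2) = z1, z2
    return math.sqrt(abs(t1 - t2)) + math.dist(x1, x2)


def holder_seminorm(f, points, alpha):
    """Largest quotient |f(z) - f(z')| / d(z, z')**alpha over all sample pairs."""
    return max(
        abs(f(*z1) - f(*z2)) / parabolic_distance(z1, z2) ** alpha
        for z1, z2 in combinations(points, 2)
    )


# Sample grid on [0, 1] x [0, 1] in one space dimension.
points = [(i / 8, (j / 8,)) for i in range(9) for j in range(9)]


def f(t, x):
    # sqrt(t) + x is Lipschitz with constant 1 for the parabolic distance.
    return math.sqrt(t) + x[0]


value = holder_seminorm(f, points, 1.0)
```

For this test function the discrete semi-norm with exponent 1 equals 1 up to rounding, since pairs with equal time coordinate realise the quotient 1 and no pair exceeds it.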
We make two assumptions on the nonlinearity \(A:\mathbb {R}^d\rightarrow \mathbb {R}^d\) in form of assumptions on the tensor field given by the derivative matrix DA:
Assumption 1
DA is uniformly elliptic in the sense that there exists a constant \(\lambda >0\) such that
Here, without loss of generality we normalized the upper bound to unity.
We will make use of (12) in the following form: For every spatial shift vector \(y\in \mathbb {R}^d\) we will work with the increment operator \(\delta _yu(t,x)=u(t,x+y)-u(t,x)\) and use the chain-type rule
where
Then (12) ensures that for all y we have uniform ellipticity of \(a_y\):
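The chain-type rule can be checked numerically in a toy case. The sketch below assumes that \(a_y\) is the usual averaged coefficient \(\int _0^1 DA(\theta \nabla u(\cdot +y)+(1-\theta )\nabla u)\,d\theta \), works in one dimension, and uses a hypothetical nonlinearity \(A(q)=0.75\,q+0.25\sin q\) whose derivative lies in \([0.5,1]\); none of these choices are taken from the paper.

```python
import math

LAMBDA = 0.5  # ellipticity constant of the toy nonlinearity (assumption)


def A(q):
    # Toy nonlinearity with derivative between LAMBDA and 1.
    return 0.75 * q + 0.25 * math.sin(q)


def DA(q):
    return 0.75 + 0.25 * math.cos(q)


def a_avg(p, q, n=2000):
    """Midpoint rule for the average of DA along the segment from q to p."""
    h = 1.0 / n
    return h * sum(
        DA((k + 0.5) * h * p + (1.0 - (k + 0.5) * h) * q) for k in range(n)
    )


def check(p, q):
    """Return the averaged coefficient and the defect in A(p) - A(q) = a (p - q)."""
    a = a_avg(p, q)
    return a, abs((A(p) - A(q)) - a * (p - q))


# p and q play the roles of the gradient at the shifted and unshifted points.
results = [check(p, q) for p, q in [(0.3, -1.2), (2.0, -1.0), (1.5, 1.5)]]
```

In each case the averaged coefficient stays between the ellipticity constant and 1, and the defect is of the size of the quadrature error, which illustrates why the increment \(\delta _y u\) solves a linear equation with uniformly elliptic coefficient.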
Assumption 2
DA is globally Lipschitz in the sense that there exists a constant \(\Lambda <\infty \) such that
We will make use of (15) in the following form: For any exponent \(\beta \in (0,1]\) we have the following estimate on the level of Hölder norms
We use Eq. (8) exclusively in the following form: We apply the increment operator \(\delta _y\) to it and obtain by (13)
which in terms of the difference \(w:=u-v\) we rewrite as
We establish our form of \(C^{1,\alpha }\)-Schauder theory, cf. Corollary 1, in two lemmas. While Lemma 1 relies only on the uniform ellipticity (12) and crucially uses the \(C^\alpha \)-a priori estimate for \(\delta _y w\) of De Giorgi and Nash based on (17), Lemma 2 also uses the Lipschitz continuity (15) and proceeds by a more standard Schauder-type argument.
Lemma 1
There exists an exponent \(\alpha _1=\alpha _1(d,\lambda )\in (0,1)\) such that for any exponent \(\alpha _0\in (0,\alpha _1)\) we have
provided we already have the qualitative information that the left hand side is finite.
The critical point in the proof of Lemma 1 is that we extract control of \([\nabla w]_{\alpha _0}\) (and thus \([\nabla u]_{\alpha _0}\)) from (17) without having to pass to the limit in the difference (quotient) \(\delta _y\), which is not possible due to the low regularity of \(\nabla v\).
Lemma 2
Let \(\alpha _0\) be as in Lemma 1 and suppose that L is so small that
where \(P_R:=(-R^2,0)\times B_R\) denotes the (centered) parabolic cylinder of size R and \([\cdot ]_{\beta ,P_R}\) the \(\beta \)-Hölder semi-norm restricted to this set. Then we have for any exponent \(\alpha \in [\alpha _0,1)\)
Corollary 1
Let \(\alpha _0\) be as in Lemma 1. Then we have for any exponent \(\alpha \in (0,1)\)
provided we already have the qualitative information that \([\nabla u]_{\alpha _0}<\infty \).
To illustrate an application of Corollary 1, we treat the case where the right hand side is a stochastic noise which is white in time but coloured in space. Such a noise term is described by a Gaussian random distribution \(\xi \) over \((t,x) \in \mathbb {R}\times \mathbb {R}^d\), the probability distribution of which is characterized by having zero mean and
where \((\xi , \varphi )\) stands for \(\xi \) tested against the Schwartz function \(\varphi \in \mathcal {S}(\mathbb {R}\times \mathbb {R}^d)\) and \(\langle \cdot \rangle \) is used for the expectation of a random variable. The spatial correlation K can be seen as the kernel of a regularising operator. Such a noise term is standard in the SPDE literature, often written in “differential notation” as
where W is an \(L^2\)-valued Wiener process with covariance operator K, see e.g. [1, Sect. 5].
Denote by v the solution of the constant-coefficient heat equation (2). Under suitable conditions on the kernel K it is known that \(\nabla v\) is regular enough, i.e. \(\alpha \)-Hölder continuous, to apply the above deterministic theory. As an illustration we treat the case where \(\xi \) is assumed to be periodic in all spatial directions, say of period 1, and in addition localised to a compact time interval, say the interval [0, 1]. If we assume in addition that the probability distribution of \(\xi \) is translation invariant in the spatial directions, so that \(K(x,y) = K(x-y)\), we have the following convenient Fourier series representation
Here the \(\beta _k\) are complex-valued standard Brownian motions (i.e. real and imaginary parts are independent and satisfy \(\langle \mathfrak {R} (\beta _k(t))^2 \rangle = \langle \mathfrak {I} (\beta _k(t))^2 \rangle = \frac{t}{2}\)), that are independent up to the constraint \(\beta _k = \overline{\beta }_{-k}\), which ensures that \(\xi \) is real-valued, and \(\dot{\beta }_k(t)\) stands for the distributional time derivative. The \(\hat{K}(k)\) are real-valued, non-negative and symmetric in the sense that \(\hat{K}(k) = \hat{K}(-k)\). The almost sure convergence of (21) in the space of distributions can be easily shown, but we adopt the slightly simpler framework to only work with v, which we define by its Fourier series representation:
In order to ensure that the gradient is well behaved we impose that there exists \(s >d \) such that for \(k\in (2 \pi \mathbb {Z})^d\)
where we have set the normalisation equal to 1 without loss of generality. Incidentally, this condition on s precisely says that the spatial covariance operator K is of trace class. Then we have the following lemma.
Lemma 3
Let v(t, x) be given by (22) for \(t> 0\) and set \(v_{|t\le 0}=0\). Then for \(\alpha < \min \{ \frac{s - d}{2}, 1 \}\) there exists \(C_0 = C_0(d,\alpha ,s)< \infty \) such that
where \(\langle \cdot \rangle \) denotes the expectation with respect to the probability distribution of v.
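The role of the constraint \(\beta _k = \overline{\beta }_{-k}\) in the Fourier representation can be illustrated with a small sketch: a truncated Fourier series whose coefficients are conjugate-symmetric takes real values. The example works in one space dimension with frequencies \(k = 2\pi n\), uses fixed Gaussian samples in place of Brownian motions, and the truncation level `N_MAX` is an arbitrary assumption.

```python
import cmath
import random

random.seed(0)
N_MAX = 5  # truncation level of the toy series (assumption)

# Conjugate-symmetric coefficients: coeff[-n] is the complex conjugate of coeff[n].
coeff = {0: complex(random.gauss(0.0, 1.0), 0.0)}
for n in range(1, N_MAX + 1):
    c = complex(random.gauss(0.0, 1.0), random.gauss(0.0, 1.0))
    coeff[n] = c
    coeff[-n] = c.conjugate()


def series(x):
    """Evaluate the truncated series sum_n coeff[n] * exp(2 pi i n x)."""
    return sum(c * cmath.exp(2j * cmath.pi * n * x) for n, c in coeff.items())


# The imaginary parts vanish up to rounding at every sample point.
imag_parts = [abs(series(i / 7).imag) for i in range(7)]
```

The same pairing of the modes k and -k is what allows the second-moment computations in the proof of Lemma 3 to be organised as sums over k with real, non-negative weights.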
Combining our main deterministic result, Corollary 1, with the stochastic result in Lemma 3 we arrive at the following theorem.
Theorem 1
Let A be uniformly elliptic with ellipticity contrast \(\lambda \) and let DA be Lipschitz continuous with constant \(\Lambda \) (in the sense of (12) and (15)). Let \(\alpha _0 = \alpha _0(d,\lambda ) \) be as in Lemma 1. Let v be given by (22) for a covariance operator K satisfying (23) for some \(s>d\). Then for almost all realisations of v, there exists a unique \(u = u(t,x)\) with the following properties:
- u is continuous, 1-periodic in all spatial directions (i.e. \(u(t,x) = u(t,x+k)\) for all \(k \in \mathbb {Z}^d\)) and \(u_{|t \le 0} = 0\).
- \([\nabla u ]_\alpha , [u-v]_{1+\alpha } < \infty \) for \(\alpha < \min \{ \frac{s-d}{2},1\}\).
- u solves (7) in the distributional sense, i.e. for all Schwartz functions \(\varphi \in \mathcal {S}(\mathbb {R}\times \mathbb {R}^d)\)
$$\begin{aligned}&- \int \int \partial _t \varphi u dx dt+ \int \int \nabla \varphi \cdot A(\nabla u) dx dt \\&\qquad = - \int \int \partial _t \varphi v dx dt + \int \int \nabla \varphi \cdot \nabla v \; dx dt . \end{aligned}$$
Furthermore, for \(\alpha < \min \{ \frac{s-d}{2}, 1\}\) there exists \(C= C(d,\lambda ,\Lambda ,\alpha ,s)<\infty \) such that
where \(\langle \cdot \rangle \) denotes the expectation with respect to the probability distribution of v.
3 Proof of Theorem 1
Throughout this proof we use the symbol \(\lesssim \) for \(\le C(d,\lambda ,\Lambda ,\alpha ,s)\). All functions u, v, w etc. appearing in the proof are assumed to be one-periodic in all space directions.
We assume we are given continuous functions v and j with \([\nabla v]_\alpha \), \([j]_\alpha < \infty \) for an \(\alpha \in (0,1)\), which are 1-periodic in each spatial direction and with \(v_{|t \le 0} = j_{|t \le 0} =0\). We show that there exists a unique function u which is one-periodic in each spatial direction, satisfies \(u_{|t \le 0} =0\) and which satisfies
for each Schwartz function \(\varphi \). In addition we show the bound
where \(N = \big ([\nabla v]_{\alpha }+[j]_{\alpha }\big )^\frac{\alpha }{\alpha _0}+ \big ([\nabla v]_{\alpha }+[j]_{\alpha }\big )\). The desired existence and uniqueness statement then follows, by applying this to the case where v is given by (22), \(j = -\nabla v\). For (24) we combine (26) and Lemma 3 to get for a suitable \(C=C(d,\lambda ,\Lambda ,\alpha ,s)\)
The existence of solutions follows by approximation through regularisation. Let \(j_\varepsilon \), \(v_\varepsilon \) be space–time regularisations (e.g. by convolution with suitable smooth kernel) of j, v satisfying \([j_\varepsilon ]_\alpha \le [j]_\alpha \), \([\nabla v_\varepsilon ]_\alpha \le [\nabla v]_\alpha \) and such that \({v_{\varepsilon }}_{|t\le -\varepsilon } = {j_{\varepsilon }}_{|t\le -\varepsilon } = 0\). Then by classical theory there exists a unique classical solution \(u_\varepsilon \) for
which is one-periodic in all spatial directions (see e.g. [7, Thm. 12.14] for a proof in the case of Dirichlet data on a bounded spatial domain; the case of the torus is only simpler). In this situation Corollary 1 applies and yields
This estimate together with the initial datum \({u_\varepsilon }_{|t=-\varepsilon } = {v_\varepsilon }_{|t=-\varepsilon } =0\) permits us to apply the Arzelà–Ascoli Theorem and to conclude that, up to choosing a subsequence, \(u_\varepsilon -v_\varepsilon \rightarrow w\), \(\nabla (u_\varepsilon - v_\varepsilon ) \rightarrow \nabla w\), \(\nabla u_\varepsilon \rightarrow \nabla u\) locally uniformly for functions u, w with \(u=w+v\). Furthermore, w solves
in the distributional sense. Setting \(u = w +v\) we obtain (25) and the estimate (26) follows by passing to the limit in (27) using lower semi-continuity.
It only remains to argue for (pathwise) uniqueness. Assume thus that \(u^1\) and \(u^2\) are one-periodic in space, satisfy (25) and vanish for \(t \le 0\). Thus the difference \(\delta u := u^1 - u^2\) satisfies
in the distributional sense and \(\delta u_{|t=0} =0\). In order to show that \(\delta u=0\) we aim to test Eq. (28) against \(\delta u\) to obtain the identity
for all \(T \ge 0\). Once the identity (29) is justified, we can invoke the uniform ellipticity (14) once more and obtain the point-wise identity
so that (29) yields \(\delta u = 0\).
It thus remains to justify (29). For this we convolve (28) with a temporal regularising kernel at scale \(\varepsilon \) and then test against \(\delta u_\varepsilon \), the temporally regularised version of \(\delta u\). Here we use the fact that under the periodicity assumption the weak formulation (25) can be restated equivalently by replacing the space integrals over \(\mathbb {R}^d\) by integrals over \([0,1]^d\) and assuming that the test functions are also periodic. This yields for any \(T>0\)
We can pass to the limit \(\varepsilon \rightarrow 0\) on both sides using the fact that \(\delta u= (u^1- v)- (u^2-v)\) is \(\frac{1+\alpha }{2}\) -Hölder continuous in time and using the fact that \(\nabla u^1\) and \(\nabla u^2\) are \(\frac{\alpha }{2}\)-Hölder continuous in time.
4 Proof of Lemma 1
Throughout this proof we write \(\lesssim \) for \(\le C(d,\lambda ,\alpha _0)\).
Based on (17) and (14) we have by a localized version of the Hölder a priori estimate of De Giorgi and Nash that there exists an exponent \(\alpha _1=\alpha _1(d,\lambda )\in (0,1)\) such that for all shift vectors y, all length scales \(\ell \) and all space–time points z
where \(P_\ell (z)=(t-\ell ^2,t)\times B_\ell (x)\) denotes the parabolic cylinder centered around \(z=(t,x)\), and where \(\Vert \cdot \Vert _{P_\ell (z)}\) stands for the supremum norm restricted to the set \(P_\ell (z)\). The exponents of the \(\ell \)-factors in (31) are determined by scaling; smuggling in the constant k is possible since (17) is oblivious to changing \(\delta _yw\) by an additive constant. We refer to [7, Theorem 6.28] as one possible reference (with \(b\equiv 0\), \(c^0\equiv 0\), and \(g\equiv 0\) so that \(k_1=\sup _{Q(R)}|f|\) in the notation of that reference). We fix an exponent \(\alpha _0\in (0,\alpha _1)\) and take the supremum of (31) over all shift vectors y with \(|y|\le r\) for some \(r\le \ell \)
We first estimate the right-hand-side terms of (32). We start with the second right-hand-side term: From the definition (11) of the Hölder semi-norm and that of the parabolic cylinder, we obtain
so that
We now turn to the first right-hand-side term of (32): We first note that
where the right-hand-side infimum ranges over all \(c\in \mathbb {R}^d\). Indeed, passing to \(\tilde{w}(t,x)=w(t,x)-c\cdot x\), so that \(\nabla w-c=\nabla \tilde{w}\), and transforming \(\tilde{k}=k-c\cdot y\), so that \(\delta _yw-k=\delta _y\tilde{w}-\tilde{k}\), we see that (34) reduces to \(\Vert \delta _y\tilde{w}\Vert _{P_{2\ell }(z)}\le |y|\Vert \nabla \tilde{w}\Vert _{P_{3\ell }(z)}\), which because of \(|y|\le r\le \ell \) is a consequence of the mean-value theorem. Since obviously \(\inf _{c}\Vert \nabla w-c\Vert _{P_{3\ell }(z)}\le (3\ell )^{\alpha _0}[\nabla w]_{\alpha _0}\), we obtain
We finally turn to the left hand side term in (32) and note
Inserting (33), (35), and (36), into (32) we obtain
which we multiply with \(\frac{1}{r^{1+\alpha _0-\alpha _1}}\) to arrive at
We now argue that we are done once we establish the norm equivalence
Indeed, choosing \(\ell = Mr\) with \(M\ge 1\) to be chosen later, we take the supremum of (37) over all radii r and all space–time points z to arrive at
into which we insert (38)
By the triangle inequality in \([\cdot ]_{\alpha _0}\) we post-process this to
Since \([\nabla u]_{\alpha _0}<\infty \) by our qualitative assumption, and since \(\alpha _0<\alpha _1\), we may choose \(M=M(d,\lambda ,\alpha _0)\) so large that this turns into the desired (18).
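The absorption step can be made explicit in a schematic form. Suppose, as an assumption about the shape of the estimate rather than a quotation of it, that a finite quantity X satisfies \(X \le C M^{\alpha _0-\alpha _1} X + (\text {data terms})\). Then any M with \(C M^{\alpha _0-\alpha _1}\le \frac{1}{2}\) allows the first term to be absorbed into the left hand side. The helper below, with the hypothetical name `absorbing_scale`, computes the smallest such \(M \ge 1\).

```python
def absorbing_scale(C, alpha0, alpha1, target=0.5):
    """Smallest M >= 1 such that C * M**(alpha0 - alpha1) <= target."""
    if not 0.0 < alpha0 < alpha1:
        raise ValueError("need 0 < alpha0 < alpha1 for the prefactor to decay in M")
    return max(1.0, (C / target) ** (1.0 / (alpha1 - alpha0)))


# Example with made-up numbers: C = 10, alpha0 = 0.1, alpha1 = 0.3.
M = absorbing_scale(10.0, 0.1, 0.3)
prefactor = 10.0 * M ** (0.1 - 0.3)
```

The computation only works because the exponent \(\alpha _0-\alpha _1\) is negative, and the absorption itself is legitimate only when X is known to be finite, which is where the qualitative assumption enters.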
We now turn to the norm equivalence (38); the elements of the argument are standard in modern Schauder theory, in the spirit of [4, Theorem 3.3.1]. By rotational symmetry, it is enough to establish
Let \(k=k(y,r,z)\) denote the optimal constant in the right hand side of (39). We first argue that for arbitrary but fixed point z, we have for all radii r
Indeed, based on the telescoping identity \(\delta _{2re_1}w=\delta _{re_1}w+\delta _{re_1}w(\cdot +re_1)\) we obtain by the triangle inequality the following additivity of k in the y-variable
Likewise, we have that k only mildly depends on the r-variable
From the two last estimates, we obtain (40). Since \(\alpha _0>0\), we learn from (40) that there exists a constant \(c_1(z)\) such that
along a given dyadic sequence of radii r. We insert this into the definition of N to obtain
from which, since in particular u and thus w is differentiable in the spatial variable, we learn that \(c_1(z)=\partial _1w(z)\) so that
Since we identified the limit, this now holds for any radius r (and not just the dyadic ones). Given two points z, \(z'\) we set \(r:=2d(z,z')\), cf. (9), and obtain
5 Proof of Lemma 2
Throughout this proof we use \(\lesssim \) for \(\le C(d,\lambda ,\Lambda ,\alpha _0,\alpha )\).
Let the two scales \(r\le \ell \le \frac{L}{4}\) be arbitrary and for the time being fixed. Let y be an arbitrary shift vector with \(|y|\le r\). By (16) in the localized form of \([a_y]_{\alpha _0,P_{3\ell }}\le \Lambda [\nabla u]_{\alpha _0,P_{3\ell +r}}\) and (19) we have
In conjunction with (14) we see that we may apply standard \(C^{1,\alpha _0}\)-Schauder theory to the parabolic operator \(\partial _t-\nabla \cdot a_y\nabla \) when localized to \(P_{3\ell }\). We learn from rescaling according to \((t,x)=(\ell ^2\hat{t},\ell \hat{x})\) that (42) is exactly the control on the coefficient needed so that the constant in this localized Schauder theory is of the desired form \(C(d,\lambda ,\Lambda ,\alpha _0,\alpha )\). We refer to [7, Theorem 4.8] for a possible reference (with \(b\equiv 0\), \(c\equiv 0\), \(g\equiv 0\) in the notation of that reference). We apply this to the increment \(\delta _yw\), cf. (17), to the effect of
We first argue that we may upgrade (43) to
The first ingredient in passing from (43) to (44) is the following elementary interpolation estimate
where \((\cdot )_r\) denotes convolution on scale r in the spatial variable. Here comes the argument for (45) where without loss of generality we may assume \(r=1\) and restrict to estimating the first component \(\partial _1w\) of the gradient. Given \((t,x)\in P_1\) this follows from combining the following immediate consequences of the mean-value theorem
so that c in (45) is given by \((\delta _{e_1}w)_1(0,0)\). The second ingredient in passing from (43) to (44) is
In order to see this we apply the spatial convolution operator \((\cdot )_r\) to (17) to the effect of
From this representation and \(r\le \ell \) we obtain the estimate
which yields (46) because of \(r\le \ell \). Inserting (43) into (46), and the outcome into (45), we obtain (44).
We now address the right-hand-side terms of (44). In view of (34) (slightly modified) we have for the first right-hand-side term
We now turn to the second right-hand-side term of (44) and note that
While obviously
we need a little argument to see
Indeed, let us focus on j; given two points z, \(z'\) in \(P_{3\ell }\) we write \(\delta _yj(z)-\delta _yj(z')\) in the two ways of \((j(z+(0,y))-j(z))-(j(z'+(0,y))-j(z'))\) and \((j(z+(0,y))-j(z'+(0,y)))-(j(z)-j(z'))\) to see that (because of \(|y|\le r\le \ell \))
and thus as desired
Inserting (49) and (50) into (48) we obtain
Inserting (47) and (51) into (44) we obtain the iterable form
Relabelling \(4\ell \) by \(\ell \) we obtain for all \(r\le \ell \le L\)
By the triangle inequality in \(\Vert \cdot \Vert \) and by \(\sup _{r\le L}r^{-\alpha }\inf _c\Vert \nabla v-c\Vert _{P_r}\le [\nabla v]_{\alpha ,P_L}\) this may be upgraded to
Slaving \(\ell \) to r via \(\ell =Mr\) for some \(M\ge 1\) to be chosen later, we obtain from distinguishing the ranges \(r\le \frac{L}{M}\) and \(\frac{L}{M}\le r\le L\) that
Clearly, the first right-hand-side term is controlled as follows
Hence fixing an \(M=M(d,\lambda ,\Lambda ,\alpha _0,\alpha )\) sufficiently large, we may absorb the second right-hand-side term in (52) into the left hand side to obtain
For this, we do not need to know beforehand that the left hand side is finite, since (52) also holds when the two suprema are restricted to \(\epsilon \le r\le L\) and \(\epsilon \le \ell \le L\) for any \(\epsilon >0\), which is finite since \(\nabla u\) is in particular assumed to be continuous. Hence we obtain (53) with supremum restricted to \(\epsilon \le r\le L\), in which we now may let \(\epsilon \downarrow 0\) to recover the form as stated in (53). By the standard norm equivalence
and shifting the origin into an arbitrary \(z\in P_L\), we obtain (20) from (53).
6 Proof of Corollary 1
Throughout the proof, we use \(\lesssim \) as in Lemma 2.
By Lemma 1, the hypothesis (19) of Lemma 2 is satisfied provided we fix \(L=c([\nabla v]_{\alpha _0}+[j]_{\alpha _0})^{-\frac{1}{\alpha _0}}\) for \(c=c(d,\lambda ,\alpha )\) sufficiently small. Hence we obtain from (20) that
By translation invariance of our deterministic setting, this persists with \(P_{L}\) replaced by the shifted parabolic cylinder \(P_{L}(z)=z+P_L\) for any point \(z\in \mathbb {R}\times \mathbb {R}^d\), leading to
This yields the desired Hölder estimate on \(\nabla u\) for points \(z, z'\) at parabolic distance less than L. For those \(z, z'\) with \(d(z,z')\ge L\) we appeal once more to (18) in form of
where we used the definition of L in the last step.
It remains to estimate the \(C^{1+\alpha }\)-norm of \(w:=u-v\), more precisely, it just remains to estimate the temporal continuity, cf. (10):
To this purpose, we rewrite (8) as \(\partial _tw=\nabla \cdot (A(\nabla u)+j)\) to which we apply spatial convolution on scale r to be fixed later. This yields the estimate
From this we deduce
We may take the convolution kernel \(\phi _r\) to be symmetric, so that in particular \(w_r(t,x)=\int \phi _r(x-y)(w(t,y)-\nabla w(t,x)\cdot (y-x))dy\), to the effect of
The last two estimates combine to
Optimizing through the choice of \(r=\sqrt{t}\) yields (54).
7 Proof of Lemma 3
Throughout this proof we use \(\lesssim \) for \(\le C(\alpha ,s,d)\).
Throughout the proof we fix \(j \in \{1, \ldots , d\}\) and set \(h = \partial _j v\). We aim to show that for C large enough and \(\alpha < \min \{ \frac{s - d}{2}, 1 \}\)
We assume without loss of generality that \(\frac{s-d}{2} < 1\).
First we recall that by definition v and h are 1-periodic in each spatial direction and \(v(t,x) = h(t,x) =0\) for \(t \le 0\). Furthermore for \(t>1\), h solves
so that by standard continuity properties of the heat equation in Hölder norms we have \([h]_{\alpha } \lesssim [h]'_{\alpha }\) where \( [h]'_{\alpha }\) is the local Hölder norm defined by
We thus aim to establish
The core stochastic ingredient for the proof of (55) is the following bound on second moments of increments of h: For (t, x), \((t',x') \in [0,1] \times \mathbb {R}^d\) we have
The argument for (56) is based on the following Fourier representation for h: For \(t \in [0,1]\) and \(x \in \mathbb {R}^d\) we get by differentiating (22) with respect to \(x_j\)
which for \(t' \le t\) leads to
In order to deduce (56), we use the triangle inequality and treat the cases \(t =t', \, x \ne x'\) and \(t \ne t', \, x = x'\) separately. In the first case we get using stationarity in x in the first and the symmetry of \(\hat{K}\) in the last equality
Now using the simple estimates \( \frac{k_j^2}{2|k|^2} \le \frac{1}{2}\), \(| 2- e^{i k \cdot (x-x') }- e^{-i k \cdot (x-x') }| \le \min \{ 4 , | k|^2 \, |x - x'|^2 \}\) as well as \( \big [ 1 - e^{-2t |k|^2 } \big ] \le 1\), and recalling condition (23) on \(\hat{K}\) this turns into the estimate
where we have used our assumption that \(s-d < 2\). In the same way we get by specialising (57) to \(x = x'\) and treating the case \(t \ge t'\)
Now using again \( \frac{k_j^2}{2|k|^2} \le \frac{1}{2}\) as well as
and using (23) once more this turns into
and thus (56) follows.
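The spatial part of this argument rests on the elementary bound \(|2-e^{i\theta }-e^{-i\theta }|\le \min \{4,\theta ^2\}\) and on the resulting scaling of a lattice sum. As a rough numerical sanity check, the sketch below takes \(d=1\) and \(s=2\), assumes the decay \(\hat{K}(k)\le (1+|k|)^{-s}\) as the form of condition (23), truncates the sum at a large frequency, and compares it with \(|x-x'|^{s-d}\). All numerical parameters are assumptions chosen for illustration.

```python
import math


def elementary_bound_holds(theta):
    # |2 - e^{i theta} - e^{-i theta}| equals 2 - 2 cos(theta).
    return 2.0 - 2.0 * math.cos(theta) <= min(4.0, theta ** 2) + 1e-12


def increment_sum(r, s=2.0, n_max=200_000):
    """Truncated sum over k = 2 pi n, n != 0, of (1+|k|)**(-s) * min(4, |k|^2 r^2)."""
    total = 0.0
    for n in range(1, n_max + 1):
        k = 2.0 * math.pi * n
        total += (1.0 + k) ** (-s) * min(4.0, (k * r) ** 2)
    return 2.0 * total  # the frequencies n and -n contribute equally


# With s = 2 and d = 1 the expected scaling is r**(s - d) = r.
ratios = [increment_sum(r) / r for r in (0.1, 0.01, 0.001)]
bound_ok = all(elementary_bound_holds(0.01 * i) for i in range(1, 2000))
```

The ratios stay of order one as r decreases by two orders of magnitude, which is the behaviour that the second-moment bound on spatial increments expresses.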
We now apply Kolmogorov’s continuity theorem to h; for the convenience of the reader we give a self-contained argument. We first appeal to Gaussianity to post-process (56), which we rewrite as
for a given scale R. By Gaussianity of h we can upgrade this estimate to
Thus proving the desired estimate (55) on Gaussian moments of the local Hölder-norm \([h]'_\alpha \) amounts to exchanging the expectation and the supremum over (t, x), (s, y) in (58) at the price of a decreased Hölder exponent \(\alpha <\frac{s-d}{2}\). To this purpose, we now argue that for \(\alpha >0\), the supremum over a continuum can be replaced by the supremum over a discrete set: For \(R<1\) we define the grid
and claim that
where the first supremum runs over all R of the form \(2^{-N}\) for an integer \(N \ge 1\). Hence we have to show for arbitrary \((t,x),(s,y)\in (-1,0)\times (-1,1)^d\) that
By density, we may assume that \((t,x),(s,y)\in r^2\mathbb {Z}\times r\mathbb {Z}^d\) for some dyadic \(r=2^{-N}<1\) (this density argument requires the qualitative a priori information of the continuity of h, which can be circumvented by approximating h). For every dyadic level \(n=N,N-1,\ldots \) we now recursively construct two sequences \((t_n,x_n)\), \((s_n,y_n)\) of space–time points, starting from \((t_N,x_N)=(t,x)\) and \((s_N,y_N)=(s,y)\), with the following properties
a) they are in the corresponding lattice of scale \(2^{-n}\), i.e. we have \((t_n,x_n),(s_n,y_n)\in (2^{-n})^2\mathbb {Z}\times 2^{-n}\mathbb {Z}^d\),
b) they are close to their predecessors in the sense of \(|t_{n}-t_{n+1}|,|s_{n}-s_{n+1}|\le 3(2^{-(n+1)})^2\) and \(|x_{n,i}-x_{n+1,i}|,|y_{n,i}-y_{n+1,i}|\le 2^{-(n+1)}\), where \(x_{n,i}\), \(x_{n+1,i}\), \(\ldots \) denote the i-component of \(x_{n}\), \(x_{n+1}\), \(\ldots \). So by definition of \(\Theta \) we have
$$\begin{aligned} |h(t_n,x_n)-h(t_{n+1},x_{n+1})|&\lesssim \Theta (2^{-(n+1)})^\alpha ,\nonumber \\ |h(s_n,y_n)-h(s_{n+1},y_{n+1})|&\lesssim \Theta (2^{-(n+1)})^\alpha , \end{aligned}$$ (60)
and
c) such that \(|t_n-s_n|\) and \(|x_n-y_n|\) are minimized among the points satisfying a) and b).
Because of the latter, we have
so that by the triangle inequality we gather from (60)
which yields (59).
Equipped with (59), we now may upgrade (58) to (55). Indeed, (59) can be reformulated on the level of indicator functions I as
where as in (59) R runs over all \(2^{-N}\) for integers \(N \ge 1\). Replacing the suprema by sums in order to take the expectation, we obtain
We now appeal to Chebyshev’s inequality in order to make use of (58):
where in the second step we have used that the number of pairs (t, x), (s, y) of neighboring lattice points is bounded by \(C\frac{1}{R^{2+d}}\) and in the last step we have used that stretched exponential decay (recall \(s-d-2\alpha >0\)) beats polynomial growth. The last estimate immediately yields (55).
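The statement that stretched exponential decay beats polynomial growth can be seen in a schematic union bound. The sketch assumes that the contribution of the dyadic scale \(R=2^{-N}\) is of the form \(R^{-(2+d)}\exp (-M R^{-\gamma })\) with \(\gamma >0\) standing in for \(s-d-2\alpha \); the precise constants in the proof are not reproduced, and the function name is hypothetical.

```python
import math


def union_bound_terms(d, gamma, M, levels=60):
    """Terms 2**(N*(2+d)) * exp(-M * 2**(N*gamma)) for N = 1..levels, via logarithms."""
    terms = []
    for N in range(1, levels + 1):
        log_term = N * (2 + d) * math.log(2.0) - M * 2.0 ** (N * gamma)
        terms.append(math.exp(log_term))  # underflows to 0.0 for large N
    return terms


# Illustrative parameters: d = 1, gamma = 0.5, thresholds M = 1 and M = 10.
terms_small_M = union_bound_terms(1, 0.5, 1.0)
terms_large_M = union_bound_terms(1, 0.5, 10.0)
total_small_M = sum(terms_small_M)
total_large_M = sum(terms_large_M)
```

The polynomial factor counts pairs of neighbouring lattice points at each scale, and the sum over scales nevertheless converges and becomes small as the threshold M grows, which is what yields the exponential moment bound.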
References
Da Prato, G., Zabczyk, J.: Stochastic Equations in Infinite Dimensions. Cambridge University Press, Cambridge (2014)
Debussche, A., De Moor, S., Hofmanová, M.: A regularity result for quasilinear stochastic partial differential equations of parabolic type. SIAM J. Math. Anal. 47(2), 1590–1614 (2015)
Hairer, M.: An introduction to stochastic PDEs (2009). arXiv preprint arXiv:0907.4178
Krylov, N.V.: Lectures on Elliptic and Parabolic Equations in Hölder Spaces, Graduate Studies in Mathematics, vol. 12. American Mathematical Society, Providence (1996)
Krylov, N.V., Rozovskii, B.L.: Stochastic evolution equations. J. Math. Sci. 16(4), 1233–1277 (1981)
Ladyzhenskaia, O.A., Solonnikov, V.A., Ural’tseva, N.N.: Linear and Quasi-linear Equations of Parabolic Type, vol. 23. American Mathematical Society, Providence (1988)
Lieberman, G.M.: Second Order Parabolic Differential Equations. World Scientific Publishing Co. Inc., River Edge (1996)
Otto, F., Weber, H.: Hölder regularity for a non-linear parabolic equation driven by space–time white noise. ArXiv e-prints (2015)
Pardoux, É.: Equations aux dérivées partielles stochastiques non lineaires monotones: Etude de solutions fortes de type Ito. Ph.D. thesis (1975)
Prévôt, C., Röckner, M.: A Concise Course on Stochastic Partial Differential Equations, vol. 1905. Springer, Berlin (2007)
Walsh, J.B.: An introduction to stochastic partial differential equations. In: Hennequin, P.L. (ed.) École d’Été de Probabilités de Saint Flour XIV-1984, pp. 265–439. Springer, Berlin (1986)
Acknowledgements
Funding was provided by Royal Society (University Research Fellowship UF14018).
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Otto, F., Weber, H. Quasi-linear SPDEs in divergence form. Stoch PDE: Anal Comp 7, 64–85 (2019). https://doi.org/10.1007/s40072-018-0122-0
DOI: https://doi.org/10.1007/s40072-018-0122-0