The infinitesimal generator of the stochastic Burgers equation

We develop a martingale approach for a class of singular stochastic PDEs of Burgers type (including fractional and multi-component Burgers equations) by constructing a domain for their infinitesimal generators. It was known that the domain must have trivial intersection with the usual cylinder test functions, and to overcome this difficulty we import some ideas from paracontrolled distributions to an infinite dimensional setting in order to construct a domain of controlled functions. Using the new domain, we are able to prove existence and uniqueness for the Kolmogorov backward equation and the martingale problem. We also extend the uniqueness result for "energy solutions" of the stochastic Burgers equation of [GP18a] to a wider class of equations.


Introduction
The (conservative) stochastic Burgers equation

∂ t u = ∆u + ∂ x u 2 + √2 ∂ x ξ, u : R + × T → R (or u : R + × R → R), (1)

where ξ is a space-time white noise, is one of the most prominent singular stochastic PDEs, a class of equations that are ill posed due to the interplay of very irregular noise and nonlinearities. The difficulty is that u has only distributional regularity (under the stationary measure it is a white noise in space for all times), and therefore the meaning of the nonlinearity ∂ x u 2 is dubious. In recent years, new solution theories like regularity structures [Hai14,FH14] or paracontrolled distributions [GIP15,GP17] were developed for singular SPDEs; see [Gub18] for an up-to-date and fairly exhaustive review. These theories are based on analytic (as opposed to probabilistic) tools. In the example of the stochastic Burgers equation we roughly speaking use that u is not a generic distribution, but a local perturbation of a Gaussian (obtained from ξ). We construct the nonlinearity and some higher order terms of the Gaussian part by explicit computation, and then we freeze the realization of ξ and of the nonlinear terms we just constructed and use pathwise and analytic tools to control the nonlinearity for the (better behaved) remainder. This requires the introduction of new function spaces of modelled (resp. paracontrolled) distributions, which are exactly those distributions that are given as local perturbations as described before, and for which the nonlinearity can be constructed.
This point of view was first developed for rough paths, which provide a pathwise solution theory for SDEs by writing the solutions as local perturbations of the Brownian motion [Lyo98,Gub04]. Rough paths provide a new topology in which the solution depends continuously on the driving noise, and this is useful in a range of applications. But of course there are also probabilistic solution theories for SDEs, based for example on Itô or Stratonovich integration (strong solutions) or on the martingale problem (weak solutions), and depending on the aim it may be easier to work with the pathwise approach or with the probabilistic one.
For singular SPDEs the situation is somewhat unsatisfactory because while the pathwise approach applies to a wide range of equations, it seems completely unclear how to set up a general probabilistic solution theory. There are some exceptions, for example martingale techniques tend to work in the "not-so-singular" case when the equation is singular but can be handled via a simple change of variables and does not require regularity structures (sometimes this is called the Da Prato-Debussche regime [DPD03,DPD02]); see [Sta07,RZZ17] and also [FL18b,FL18a] for an example where the change of variable trick does not work but still the equation is not too singular. For truly singular equations there exist only very few probabilistic results. R. and X. Zhu constructed a Dirichlet form for the Φ 4 3 equation and used the pathwise results to show that the form is closable [ZZ17], but it is unclear if the process corresponding to this form is the same as the one that is constructed via regularity structures, or even if it is unique.
Maybe the strongest probabilistic results exist for the stochastic Burgers equation (1): First results, on which we comment more below, are due to Assing [Ass02]. In [GJ14] Gonçalves and Jara construct so-called energy solutions to the Burgers equation, roughly speaking by requiring that u solves the martingale problem associated to

∂ t u = ∆u + lim ε→0 ∂ x (ρ ε * u) 2 + √2 ∂ x ξ,

where ρ ε is an approximation of the identity. This notion of solution is refined in [GJ13], where the authors additionally impose a structural condition for the time-reversed process (u T −t ) t∈[0,T ] , and they assume that u is stationary. These two assumptions allow them to derive strong estimates for additive functionals ∫ 0 · F (u s )ds of u via the Itô trick. They obtain the existence of solutions in this stronger sense by Galerkin approximation. The uniqueness of the refined solutions is shown in [GP18a], leading to the first probabilistic well-posedness result for a truly singular SPDE. Extensions to non-stationary initial conditions that are absolutely continuous with respect to the invariant measure are given in [GJS15,GP18b], and in [Yan18] some singular initial conditions are considered; see also [GPS17] for the Burgers equation with Dirichlet boundary condition.
The reason why the uniqueness proofs work is that we can linearize the equation via the Cole-Hopf transform: By formally applying Itô's formula, we get u = ∂ x log w, where w solves the stochastic heat equation ∂ t w = ∆w + √ 2wξ, a well posed equation which can be handled with classical SPDE approaches as in [Wal86,DPZ14,LR15]. The proof of uniqueness in [GP18a] shows that the formal application of Itô's formula is allowed for the refined energy solutions of [GJ13], and it heavily uses the good control of additive functionals from the Itô trick. Since the Cole-Hopf transform breaks down for essentially all other singular SPDEs, there is no hope of extending this approach to other equations.
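For the reader's convenience, the formal computation behind the Cole-Hopf linearization can be recorded as follows; this is only a heuristic, since a rigorous version must deal with the Itô correction and renormalization, which is precisely what the uniqueness proof of [GP18a] makes precise:

```latex
% Heuristic Cole-Hopf computation: let w solve the stochastic heat equation
%   \partial_t w = \Delta w + \sqrt{2}\, w\, \xi ,
% and set h := \log w. Ignoring Ito correction terms,
\partial_t h = \frac{\partial_t w}{w}
             = \frac{\Delta w}{w} + \sqrt{2}\,\xi
             = \Delta h + (\partial_x h)^2 + \sqrt{2}\,\xi ,
% using \Delta w / w = \Delta h + (\partial_x h)^2 for h = \log w.
% Differentiating in x shows that u := \partial_x h = \partial_x \log w
% formally solves the conservative Burgers equation
\partial_t u = \Delta u + \partial_x u^2 + \sqrt{2}\,\partial_x \xi .
```

The middle identity is the usual chain-rule computation for the logarithm; the suppressed Itô correction is exactly where the renormalization of the singular product enters.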
The aim of the present paper is to provide a new and intrinsic (transformation-free) martingale approach to some singular SPDEs. For simplicity we carry out the main argument for the example of the Burgers equation, but later we also treat multi-component and fractional generalizations. The starting point is the observation that u is a Markov process, and therefore it must have an infinitesimal generator. The problem is that typical test functions on the state space of u (the space of Schwartz distributions) are not in the domain of the generator; this includes the test functions used in the energy solution approach: for a test function f the Burgers drift contributes a term which is not of finite variation, which means that for ϕ(u) = u(f ) the process (ϕ(u t )) t is not a semimartingale, and therefore ϕ cannot be in the domain of the generator. This was already noted by Assing [Ass02], who defined the formal generator on cylinder test functions, but with image in the space of Hida distributions. Our aim is to find a (more complicated) domain of functions that are mapped to functions, and not distributions, under a formal extension of Assing's operator.
For this purpose we take inspiration from recent developments in singular diffusions, i.e. diffusions with distributional drift. Indeed, Assing's results show that we can interpret the Burgers drift as a distribution in an infinite-dimensional space, see also the discussion in [GP18b]. In finite dimensions the papers [FRW03,FRW04,DD16,CC18] all follow a similar strategy for solving dX t = b(X t )dt + dW t with distributional b: They identify a domain for the formal infinitesimal generator L = 1 2 ∆ + b · ∇ and then show existence and uniqueness of solutions for the corresponding martingale problem. So far this is very classical, but the key observation is that for distributional b the domain does not contain any smooth functions; instead one has to identify a class of non-smooth test functions with a special structure, adapted to b. Roughly speaking, they must be local perturbations of a linear functional constructed from b. This is very reminiscent of the rough path/regularity structure philosophy, and in fact [DD16,CC18] even use tools from rough paths resp. paracontrolled distributions.
We would like to use the same strategy for the stochastic Burgers equation. But rough paths and paracontrolled distributions are finite-dimensional theories, and here we are in an infinite-dimensional setting. To set up a theory of function spaces and distributions we need a reference measure (in finite dimensions this is typically Lebesgue measure), and we will work with the stationary measure of u, the law µ of the white noise. This is a Gaussian measure, and by the chaos decomposition we can identify L 2 (µ) with the Fock space ⊕ n≥0 L 2 (T n ), which has enough structure so that we can do analysis on it. In that way we construct a domain of controlled functions which are mapped to L 2 (µ) by the generator of u, and this allows us to define a martingale problem for u. By Galerkin approximation we easily obtain the existence of solutions to the martingale problem. To see uniqueness, we use the duality with the Kolmogorov backward equation: Existence for the backward equation yields uniqueness for the martingale problem, and existence for the martingale problem yields uniqueness for the backward equation. We construct solutions to the backward equation by a compactness argument, relying on energy estimates in spaces of controlled functions. In that way we obtain a self-contained probabilistic solution theory for the Burgers equation and its fractional and multi-component generalizations. As a simple application we obtain the exponential L 2 -ergodicity of u. This program is somewhat related to the recent advances in regularization by noise for SPDEs [DPFPR13,DPFRV16], where unique strong solutions for SPDEs with bounded measurable drift are constructed by solving infinite-dimensional resolvent type equations. Of course our drift is strongly unbounded (and not even a function).
Finally we study the connection of our new approach with the Gonçalves-Jara energy solutions. One of the main motivations for studying the martingale problem for singular SPDEs is that it is a convenient tool for deriving the equations as scaling limits: The weak KPZ universality conjecture [Qua12,Cor12,QS15] says that a wide range of interface growth models converge in the weakly asymmetric or the weak noise regime to the Kardar-Parisi-Zhang (KPZ) equation ∂ t h = ∆h + (∂ x h) 2 + √2 ξ, whose derivative u = ∂ x h solves the stochastic Burgers equation (1). Energy solutions are a powerful tool for proving this convergence, see e.g. [GJ14,GJS15,FGS16,DGP17,GP16]. For that purpose it is crucial to work with nice test functions, and since there seems to be no easy way of identifying the complicated functions in the domain of the generator of u with test functions on the state space of a given particle system, our new martingale problem is probably not so useful for deriving convergence theorems. This motivates us to show that the notion of energy solution is in fact stronger than our martingale problem: Every energy solution solves the martingale problem for our generator, and thus it is unique in law.
All this also works for the fractional and multi-component Burgers equations. For the fractional Burgers equation we treat the entire locally subcritical regime (in the language of Hairer [Hai14]), which in regularity structures would lead to very complicated expansions, while for us a first order expansion is sufficient, although by now there are very sophisticated and powerful black-box tools available in regularity structures that should handle the complicated expansions automatically [BHZ16,CH16,BCCH17].
The linchpin of our approach is the Gaussian invariant measure µ, and in principle our methods should extend to other equations with Gaussian invariant measures, like the singular stochastic Navier-Stokes equations studied in [GJ13]. It would even suffice to have a Gaussian quasi-invariant measure, i.e. a process which stays absolutely continuous (or rather incompressible in the sense of Definition 4.2) with respect to a Gaussian reference measure. But for general singular SPDEs we would have to work with more complicated measures like the Φ 4 3 measure, for which we cannot reduce the analysis to the Fock space. Currently it is not clear how to extend our methods to such problems. So while we provide a probabilistic theory for some singular SPDEs that actually tackles the problem at hand, rather than shifting the singularity away via the Cole-Hopf transform, it is still much less general than regularity structures, and it remains an important and challenging open problem to find more general probabilistic methods for singular SPDEs.
Structure of the paper Below we introduce some commonly used notation. In Section 2 we derive the explicit representation of the Burgers generator on Fock space and we introduce a space of controlled functions which are in the domain of the generator. In Section 3 we study the Kolmogorov backward equation and show the existence of solutions with the help of energy estimates for the Galerkin approximation and a compactness principle in controlled spaces, while uniqueness is easy. Section 4 is devoted to the martingale problem: We show existence via tightness of the Galerkin approximations and uniqueness via duality with the backward equation. As an application of our results we give a short proof of exponential L 2 -ergodicity. Finally we formulate a cylinder function martingale problem in the spirit of energy solutions, and we show that it is stronger than the martingale problem and therefore also has unique solutions. In Section 5 we briefly discuss extensions to multi-component and fractional Burgers equations. We do all the analysis on the torus, but with minor changes it carries over to the real line, as we explain in Section 5.3. The appendix collects some auxiliary estimates.
Acknowledgments The authors would like to thank the Isaac Newton Institute for Mathematical Sciences for support and hospitality during the programme SRQ: Scaling limits, Rough paths, Quantum field theory when part of the work on this paper was undertaken.
Notation We work on the torus T = R/Z. The Fourier transform of ϕ ∈ L 2 (T n ) is

φ̂(k) := ∫ T n ϕ(x)e −2πi k·x dx, k ∈ Z n .

To shorten the formulas we usually write k 1:n := (k 1 , . . . , k n ), x 1:n := (x 1 , . . . , x n ) and ∫ x (· · · ) := ∫ (· · · )dx. Moreover, we set Z 0 := Z \ {0} and we mostly restrict our attention to the subspace of functions with vanishing zeroth Fourier mode. The space C k p (R n , R) consists of all C k functions whose partial derivatives of order up to k have polynomial growth.
We write a ≲ b or b ≳ a if there exists a constant c > 0, independent of the variables under consideration, such that a ≤ c · b, and we write a ≃ b if a ≲ b and b ≲ a.

A domain for the Burgers generator

The generator of the Galerkin approximation
Consider the solution u m : R + × T → R to the Galerkin approximation of the conservative stochastic Burgers equation

∂ t u m = ∆u m + ∂ x Π m (Π m u m ) 2 + √2 ∂ x ξ, (2)

where ξ is a space-time white noise and Π m is the projection onto the first 2m + 1 Fourier modes. Throughout the paper we write µ for the law of the average-zero white noise on T, i.e. the centered Gaussian measure on H −1/2− (T) with covariance ∫ u(f )u(g)µ(du) = ⟨f, g⟩ L 2 (T) for all f, g ∈ ∩ ε>0 H 1/2+ε (T).
Lemma 2.1. Equation (2) has a unique strong solution u ∈ C(R + , H −1/2− (T)) for every deterministic initial condition in H −1/2− (T). The solution is a strong Markov process and µ is invariant for it. Moreover, for all α > 1/2 the p-th moments of the solution are controlled by a constant C = C(m, t, p, α) > 0.

Proof. Local existence and uniqueness and the strong Markov property follow from standard theory, because written in Fourier coordinates we can decouple u m = v m + Z m , where v m = Π m u m solves a finite-dimensional SDE with locally Lipschitz continuous coefficients and Z m = (1 − Π m )u m solves an infinite-dimensional but linear SDE. Global existence and invariance of µ are shown in Section 4 of [GJ13]. It is well known and easy to check that Z m has trajectories in C(R + , H −1/2− (T)), see e.g. [GP15, Chapter 2.3], and v m has compact spectral support and therefore even v m ∈ C(R + , C ∞ (T)). Thus u m has trajectories in C(R + , H −1/2− (T)). The moment bound can be derived using similar arguments as in [GJ13]. The reason why v m behaves nicely is that the drift B m (u) := ∂ x Π m (Π m u) 2 leaves the L 2 (T) norm invariant, since ∫ T B m (u)u dx = 0 by the periodic boundary conditions. To see the invariance of µ we also need that B m is divergence free when written in Fourier coordinates. See Section 4 of [GJ13] or Lemma 5 of [GP16] for details.
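For intuition, the Galerkin dynamics (2) can be simulated with a standard pseudospectral scheme. The following sketch is not from the paper: the grid size, the explicit Euler-Maruyama time stepping, and the projection of all modes (including the linear part) onto the first 2m + 1 frequencies are illustrative simplifications.

```python
import numpy as np

def step(u, m, dt, rng):
    """One explicit Euler-Maruyama step for a fully projected version of
    the Galerkin approximation (2) on the torus T = R/Z:
        du = (Laplacian u + dx Pi_m (Pi_m u)^2) dt + sqrt(2) dx dW.
    Illustrative sketch only; normalizations follow numpy's FFT conventions."""
    n = len(u)
    k = np.fft.rfftfreq(n, d=1.0 / n)            # integer frequencies 0, 1, ..., n/2
    proj = (np.abs(k) <= m).astype(float)        # Pi_m: keep modes |k| <= m
    u_hat = np.fft.rfft(u) * proj
    u_hat[0] = 0.0                               # average zero, as under mu
    pm_u = np.fft.irfft(u_hat, n)                # Pi_m u on the grid
    # Burgers drift B_m(u) = dx Pi_m (Pi_m u)^2, computed pseudospectrally
    nonlin_hat = 2j * np.pi * k * proj * np.fft.rfft(pm_u ** 2)
    lap_hat = -(2.0 * np.pi * k) ** 2 * u_hat
    # conservative noise sqrt(2) dx xi: differentiate a space-time white
    # noise increment (grid values iid N(0, dt * n) for spacing 1/n)
    noise = rng.standard_normal(n) * np.sqrt(dt * n)
    noise_hat = 2j * np.pi * k * proj * np.fft.rfft(noise)
    u_hat = u_hat + dt * (lap_hat + nonlin_hat) + np.sqrt(2.0) * noise_hat
    return np.fft.irfft(u_hat, n)

rng = np.random.default_rng(1)
u = np.zeros(64)
for _ in range(200):
    u = step(u, m=8, dt=1e-4, rng=rng)
```

Note that the zero mode is forced to vanish in every step, mirroring the average-zero convention for µ, and that the drift term conserves the L 2 norm exactly as in the proof of Lemma 2.1.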
We define the semigroup of u m by T m t ϕ(u) := E u [ϕ(u m t )] for all bounded and measurable ϕ : H −1/2− (T) → R, where under P u the process u m solves (2) with initial condition u.
Lemma 2.2. For all p ∈ [1, ∞] the family of operators (T m t ) t≥0 can be uniquely extended to a contraction semigroup on L p (µ), which is continuous for p ∈ [1, ∞).
Proof. This uses the invariance of µ and follows by approximating L p functions with bounded measurable functions. To see the continuity for p ∈ [1, ∞) we use that in this case continuous bounded functions are dense in L p (µ).
Our next aim is to derive the generator of the semigroup T m on L 2 (µ). For that purpose let f 1 , . . . , f n ∈ C ∞ (T), let Φ ∈ C 2 p (R n , R), the C 2 functions with polynomially growing partial derivatives of order up to 2, and let ϕ ∈ C be a cylinder function of the form ϕ(u) = Φ(u(f 1 ), . . . , u(f n )). Let us introduce the notation G m ϕ(u) := ∫ x B m (u)(x)D x ϕ(u), where D x is the Malliavin derivative and B m (u) = ∂ x Π m (Π m u) 2 is the Burgers drift, and let L 0 denote the generator of the linear part of the dynamics. Then Itô's formula gives that ϕ(u m t ) − ϕ(u m 0 ) − ∫ 0 t (L 0 + G m )ϕ(u m s )ds is a martingale. To extend this to more general functions ϕ and to obtain suitable bounds for L 0 and G m we work with the chaos expansion: Every function ϕ ∈ L 2 (µ) can be written uniquely as ϕ = Σ n≥0 W n (ϕ n ), where ϕ n ∈ L 2 0 (T n ) is symmetric in its n arguments and W n is an n-th order Wiener-Itô integral; here L 2 0 (T n ) := {ϕ ∈ L 2 (T n ) : φ̂(k) = 0 for all k ∈ Z n \ Z n 0 }. Moreover, we have ‖ϕ‖ 2 L 2 (µ) = Σ n≥0 n!‖ϕ n ‖ 2 L 2 (T n ) ; see [Nua06,Jan97] for details. If ϕ n ∈ L 2 0 (T n ) is not necessarily symmetric, then we define W n (ϕ n ) := W n (ϕ̃ n ), where ϕ̃ n (x 1 , . . . , x n ) := (1/n!) Σ σ∈Σ n ϕ n (x σ(1) , . . . , x σ(n) ), with Σ n the symmetric group, is the symmetrization of ϕ n . By the triangle inequality we have ‖ϕ̃ n ‖ L 2 (T n ) ≤ ‖ϕ n ‖ L 2 (T n ) .
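A one-dimensional toy illustration of the chaos isometry (not from the paper): for X ~ N(0, 1) the probabilists' Hermite polynomials He n span the n-th chaos, and E[He n (X)He k (X)] = n! δ nk , the scalar analogue of ‖W n (ϕ n )‖ 2 = n!‖ϕ n ‖ 2 L 2 (T n ) . A quick numerical check:

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval

def hermite_inner(n, k, deg=40):
    """E[He_n(X) He_k(X)] for X ~ N(0,1), computed with Gauss-Hermite
    quadrature for the weight exp(-x^2/2); He_n are the probabilists'
    Hermite polynomials, i.e. the one-dimensional Wiener chaoses."""
    x, w = hermegauss(deg)                   # nodes/weights for exp(-x^2/2)
    he_n = hermeval(x, [0.0] * n + [1.0])    # coefficient vector selects He_n
    he_k = hermeval(x, [0.0] * k + [1.0])
    return float(np.sum(w * he_n * he_k) / np.sqrt(2.0 * np.pi))
```

The quadrature is exact for polynomials of degree up to 2·deg − 1, so the orthogonality relation holds to machine precision for moderate n, k.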
Convention. In the following a norm · without subscript always denotes the L 2 (µ) norm, and an inner product ·, · without subscript denotes the L 2 (µ) inner product.
Lemma 2.3. Let ϕ ∈ C with chaos expansion ϕ = Σ n≥0 W n (ϕ n ). Then L m ϕ = L 0 ϕ + G m ϕ, where G m = G m + + G m − splits into a part raising and a part lowering the chaos order by one, each given by explicit kernels.

Proof. The proof is the same as for [GP18a, Lemma 3.7].
Let us look more carefully at the last term on the right hand side. Note that ∂ x ρ m x (s) = −∂ s ρ m s (x) and that ϕ n is symmetric under exchange of its arguments. Therefore, the corresponding kernel vanishes after symmetrisation, since now ∂ s can be integrated by parts. We deduce that the last term in the decomposition of ∫ x B m (u)(x)D x W n (ϕ n ) vanishes. It remains to show that −G m + is the adjoint of G m − : Since ϕ n+1 is symmetric in its (n + 1) arguments, we have ⟨ϕ n+1 , ψ⟩ L 2 (T n+1 ) = ⟨ϕ n+1 , ψ̃⟩ L 2 (T n+1 ) for all ψ, where ψ̃ is the symmetrization of ψ, and therefore we do not need to symmetrize the kernel of G m + W n (ϕ n ) in the following computation. A direct computation, integrating over r 1:n , x, y, s and renaming the variables in the last step as r 1 ↔ x, r 2 → y, r i → r i−1 for i ≥ 3, reduces the claim to the identities ρ m r 1 (s) = ρ m s (r 1 ) and ∂ s ρ m s (x) = −∂ x ρ m x (s).

Remark 2.5. Note that the proof did not use the specific form of ρ m , and the same arguments work as long as ρ m is an even function.
For m → ∞ the kernel of G m − W n (ϕ n ) formally converges to

∫ x,y ∂ x (δ x (y)δ x (r 1 ))ϕ n (x, y, r 2:n−1 ) = −∫ x,y δ x (y)δ x (r 1 )∂ 1 ϕ n (x, y, r 2:n−1 ) = −∂ 1 ϕ n (r 1 , r 1 , r 2:n−1 ),

where δ denotes the Dirac delta. For sufficiently nice ϕ n this kernel is in L 2 0 (T n−1 ). On the other hand, the kernel of the formal limit G + W n (ϕ n ) will never be in L 2 0 (T n+1 ), no matter how nice ϕ n is. The idea is therefore to construct (non-cylinder) functions for which suitable cancellations happen between L 0 and G, and whose image under the Burgers generator L belongs to L 2 (µ).
It will be easier for us to work on the Fock space ΓL 2 = ΓL 2 (T) = ⊕ n≥0 L 2 0 (T n ) with norm

‖ϕ‖ 2 ΓL 2 := Σ n≥0 n!‖ϕ n ‖ 2 L 2 (T n ) = Σ n≥0 n!‖φ̂ n ‖ 2 ℓ 2 (Z n 0 ) ,

where the functions ϕ n ∈ L 2 0 (T n ) are symmetric, and where we applied Parseval's identity. We also identify non-symmetric ϕ n ∈ L 2 (T n ) with their symmetrizations. As discussed above, the space ΓL 2 is isomorphic to L 2 (µ), and in the following we will often identify ϕ ∈ ΓL 2 with an element of L 2 (µ) and vice versa, without explicitly mentioning it.
Definition 2.6. The number operator (or Ornstein-Uhlenbeck operator) N acts on Fock space as (N ϕ) n := nϕ n . With a small abuse of notation, we denote with the same symbols L, L 0 , G m + , G m − the Fock space versions of the operators introduced above, in such a way that on smooth cylinder functions the two definitions agree.

Lemma 2.7. In Fourier variables the operators L 0 , G m + , G m − are given by explicit multiplication resp. convolution kernels, where the functions on the right hand side may not be symmetric, so strictly speaking we still have to symmetrize them.
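To make the Fourier-multiplier picture concrete, here is a toy implementation with assumed conventions that may differ from the paper's normalizations: kernels of the n-th chaos are stored as n-dimensional arrays over modes in {−K, . . . , K} \ {0}, and L 0 acts as the multiplier −4π 2 |k 1:n | 2 .

```python
import numpy as np

# Toy truncated Fock space: phi_n is the Fourier kernel of the n-th chaos,
# an n-dimensional array indexed by the nonzero modes in {-K..K}\{0}.
# (Illustrative conventions only.)
K = 3
modes = np.array([k for k in range(-K, K + 1) if k != 0])

def apply_L0(phi_n):
    """(L_0 phi)_n(k_{1:n}) = -4 pi^2 (k_1^2 + ... + k_n^2) phi_n(k_{1:n}),
    i.e. L_0 is a diagonal Fourier multiplier on each chaos."""
    n = phi_n.ndim
    mult = np.zeros(phi_n.shape)
    for axis in range(n):
        shape = [1] * n
        shape[axis] = len(modes)
        mult = mult + (modes.astype(float) ** 2).reshape(shape)
    return -4.0 * np.pi ** 2 * mult * phi_n
```

In the same picture the number operator N simply multiplies the n-th kernel by n, which makes the commutation of L 0 and w(N ) used later transparent.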

A priori estimates for the Burgers drift
Here we derive some a priori estimates for the Burgers drift. We work with weighted norms on the Fock space.
Lemma 2.8. The operators G m ± satisfy a bound for G m − , uniformly in m, for all γ ≤ 1/4; a distributional bound for G m + , uniformly in m, for all γ > 1/4; and the following m-dependent bound for G m + without loss of regularity.

Proof. 1. We start by estimating G m − uniformly in m. The Cauchy-Schwarz inequality together with Lemma A.1 (here we need γ < 1/2, which holds because γ ≤ 1/4) yields a first estimate, where in the last step we use the symmetry of φ̂ n+1 in the variables k 1:n+1 and that 3/2 − 2γ ≥ 1 (which is equivalent to γ ≤ 1/4). Therefore, we obtain a bound that is uniform in m.

2. To derive the uniform-in-m bound for G m + , we apply Lemma A.1 (using that 2γ > 1/2).

3. If we do not estimate G m + in a distributional space, we still obtain a bound with a factor m 1/2 , and thus as before ‖w(N )G m + ϕ‖ ≲ m 1/2 ‖w(N + 1)(1 + N )(−L 0 ) 1/2 ϕ‖. By making similar use of the cutoff 1 |p|,|q|≤m we obtain also the remaining bound.

Remark 2.9. For later reference let us recall the corresponding bound from the proof, which holds for all β < 1/2.

Remark 2.10. In the study of fluctuations of additive functionals of Markov processes the graded sector condition is sometimes useful. This condition assumes that there exists a grading of orthogonal subspaces of L 2 (µ), such that on each subspace the quadratic form of the full generator can be controlled in terms of the one of the symmetric part of the generator, see [KLO12, Chapter 2.7.4]. However, while at first glance this may seem tailor-made for our situation, there is an important restriction: For the graded sector condition we would need such a control with an exponent β < 1, see [KLO12, eq. (2.45)], while by Lemma 2.8 we can only take β = 1, and therefore the graded sector condition just barely fails. On the other hand, we can take (−L 0 ) 1/4 ϕ n on the right hand side, and we will leverage this gain in regularity. It will also be important for us that β = 1: for β > 1 the computations in Section 3.1 would not work.
Corollary 2.11. Every cylinder function ϕ ∈ C belongs to dom(L m ), and on C the infinitesimal generator of (T m t ) t≥0 coincides with L m = L 0 + G m .

Proof. Let u m be the solution of the martingale problem for the generator L m , with initial condition u. If ϕ ∈ C is a cylinder function, then T m t ϕ − ϕ = ∫ 0 t T m s L m ϕ ds by approximation (with a Bochner integral in L 2 (µ) on the right hand side), where we used our a priori estimates for G m ± . From this it follows that ϕ ∈ dom(L m ), where now we take L m as the infinitesimal generator of (T m t ) t≥0 (which is only a small abuse of notation, because both our definitions of L m agree on cylinder functions). Our claim now follows by standard results for semigroups in Banach spaces, see e.g. Proposition 1.1.5 in [EK86].

Controlled functions
Lemma 2.8 gives bounds for G m ϕ that are either in distributional spaces or diverge with m. To construct a domain that is mapped to ΓL 2 by the limiting generator L we need to consider functions ϕ for which Gϕ and L 0 ϕ exhibit some cancellations; in particular L 0 ϕ should also be a distribution, and ϕ should be non-smooth. For finite-dimensional diffusions with distributional drift b such functions can be constructed by solving the resolvent equation (λ − 1 2 ∆ − b · ∇)ϕ = ψ for nice ψ.

Remark 2.12. This remark addresses experts in pathwise approaches to singular SPDEs and can be skipped: If b is in the Besov space C −α := B −α ∞,∞ for α > 0, then u → b · ∇u is well defined whenever u ∈ C 1+α+ε for some ε > 0, and in that case b · ∇u ∈ C −α . Since the Laplacian gains back 2 degrees of regularity, we are mapped back to C 2−α , so we can close the estimates if 2 − α > 1 + α, i.e. if α < 1/2. This is the "Young regime", but the equation is subcritical for all α < 1, and for α ∈ [1/2, 1) we need to assume that u is not a generic element of the function space C 2−α but instead has a special structure, adapted to the equation (it is modelled, or paracontrolled if α < 2/3).
In our case we could start with a nice function ψ ∈ ΓL 2 and try to solve the resolvent equation (λ − L)ϕ = ψ, so that Lϕ = λϕ − ψ, and the right hand side is in ΓL 2 if ϕ, ψ ∈ ΓL 2 . Regarding regularity with respect to L 0 , this is actually in the "Young regime": Gϕ is well defined whenever ϕ ∈ (−L 0 ) −1/4−ε ΓL 2 , and then G loses (−L 0 ) 3/4 "derivatives", while (λ − L 0 ) −1 gains enough regularity to map back to (−L 0 ) −1/4−ε ΓL 2 . But in this formal discussion we ignored the behavior with respect to N , and we are unable to solve the resolvent equation with such simple arguments because G introduces some growth in N which cannot be cured by applying (λ − L 0 ) −1 . So instead we introduce an approximation G ≻ of G which captures the singular part of the small scale behaviour of G, by restricting G to Fourier modes above a suitable (N -dependent) cutoff N n , to be determined in order for this operator to be small enough in certain norms. Using G ≻ we introduce a controlled Ansatz of the form

L 0 ϕ + G ≻ ϕ = L 0 ϕ ♯ , i.e. ϕ = ϕ ♯ + (−L 0 ) −1 G ≻ ϕ, (11)

where ϕ ♯ will be chosen with sufficient regularity in ΓL 2 . Note that this is essentially the resolvent equation for λ = 0 and ψ = (−L 0 )ϕ ♯ , except that we replaced G with G ≻ . The motivation for this is that now we can trade regularity in (−L 0 ) for regularity in N , as will become clear from the computations below. A useful intuition about the Ansatz (11) is that, starting from a given test function ϕ ♯ , it "prepares" functions ϕ which have the right small scale behaviour compatible with the operator L.
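Schematically, and only tracking powers of (−L 0 ) (this is exactly the bookkeeping that the N -dependence destroys), the Young-regime counting from the preceding discussion reads:

```latex
% Regularity bookkeeping for the resolvent equation (\lambda - L_0 - G)\varphi = \psi,
% measured in powers of (-L_0); the chain closes, as in the Young regime:
\varphi \in (-L_0)^{-1/4-\epsilon}\,\Gamma L^2
  \;\Longrightarrow\;
G\varphi \in (-L_0)^{1/2-\epsilon}\,\Gamma L^2
  \quad (G \text{ loses } (-L_0)^{3/4}),
\qquad
(\lambda - L_0)^{-1} G\varphi \in (-L_0)^{-1/2-\epsilon}\,\Gamma L^2
  \subset (-L_0)^{-1/4-\epsilon}\,\Gamma L^2 .
```

The final inclusion uses that (−L 0 ) is bounded away from zero on mean-zero chaoses, so extra smoothing only shrinks the space.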
We start by showing that for an appropriate cutoff N n we can solve equation (11) and express ϕ as a function of ϕ ♯ .
Definition 2.13. A weight is a map w : N 0 → (0, ∞) such that there exists C > 0 with w(n) ≤ Cw(n + i) for i ∈ {−1, 1}. In that case we write |w| for the smallest such constant C.
Remark 2.16. The cutoff N n for which we can construct Kϕ ♯ depends on the weight w via |w|; we say that the cutoff is adapted to the weight w if the construction of Lemma 2.14 works. If we consider weights w(n) = (1 + n) α with |α| K for a fixed K, then |w| is uniformly bounded and we can choose one cutoff which is adapted to all those weights. This is the situation that we are mostly interested in.
Remark 2.17. The bound (39) also holds for G m,≻ , which is defined analogously to G ≻ . Therefore, we can also construct an analogous map K m . Let us write G ≺ := G − G ≻ . The following proposition controls LKϕ ♯ in terms of ϕ ♯ and is formulated in the limit m → ∞, but by Remark 2.17 it is clear that similar bounds hold for L m K m ϕ ♯ , uniformly in m.
Lemma 2.19. For a given weight w and a cutoff as in Proposition 2.18 (for γ = 0) we define the domain as announced above.

Proof. Let ψ be as in the statement of the lemma. Since such ψ are dense in w(N ) −1 ΓL 2 , it suffices to construct ϕ M such that the inequalities (19) hold. For this purpose we apply Lemma 2.14 to find a unique function ϕ M ∈ w(N ) −1 ΓL 2 that satisfies the controlled equation. The first contribution ψ satisfies the required bounds by assumption, so it suffices to show that the second contribution, which we denote by ψ M , satisfies them as well. But ψ M can be estimated similarly as in (9): if the cutoff MN n were independent of n, we would get ‖w(N )(−L 0 )ψ M ‖ ≲ (MN n ) 1/2 ‖w(N )(1 + N )(−L 0 ) 1/2 ϕ M ‖ from (9), so after including the factor N n ≃ (1 + n) 3 into the weight, the first estimate of (20) follows from (13). Similarly, since N n ≃ (1 + n) 3 , we get with (7), (8) a bound which together with (13) yields (20) and then (19).
Remark 2.20. As discussed before, the analysis above works also for L m , and we define the corresponding domain D(L m ) analogously.

Remark 2.21. The same construction works for the operator L (λ) = L 0 + λG for λ ∈ R. For λ ≠ 1 the intersection of the resulting domain D(L (λ) ) with D(L) consists only of constants.

Moreover, L is dissipative on D(L): for ϕ ∈ D(L) we have ⟨Lϕ, ϕ⟩ = ⟨L 0 ϕ, ϕ⟩ ≤ 0.

Proof. Note that ϕ ∈ D(L) implies L 0 ϕ, Gϕ ∈ (−L 0 ) 1/2 ΓL 2 and ϕ ∈ (−L 0 ) −1/2 (1 + N ) −1 ΓL 2 , and therefore we can conclude by approximation in the chain of equalities, since all the inner products are well defined. In particular we used the antisymmetry of the form associated to G.

The Kolmogorov backward equation
So far we constructed a dense domain D(L) for the operator L. In this section we will analyze the Kolmogorov backward equation ∂ t ϕ = Lϕ. More precisely we consider the backward equation for the Galerkin approximation (2) with generator L m , and we derive uniform estimates in controlled spaces for the solution. By compactness, this gives the existence of strong solutions to the backward equation after removing the cutoff. Uniqueness easily follows from the dissipativity of L.

A priori bounds
Recall that T m is the semigroup generated by the Galerkin approximation u m , the solution to (2). Here we consider ϕ m (t) = T m t ϕ m 0 for ϕ m 0 ∈ D(L m ) and we derive some basic a priori estimates without using the controlled structure that we introduced above. Roughly speaking our aim is to gain some control of the growth in the chaos variable n by making use of the antisymmetry of G. In the next section we then handle the regularity with respect to (−L 0 ) by using the controlled structure.
Recall from Corollary 2.11 that ∂ t ϕ m = L m ϕ m , which yields

∂ t ‖ϕ m ‖ 2 = 2⟨ϕ m , L m ϕ m ⟩ = 2⟨ϕ m , L 0 ϕ m ⟩ + 2⟨ϕ m , G m ϕ m ⟩,

and since we saw in Lemma 2.4 that ⟨ϕ m , G m ϕ m ⟩ = −⟨G m ϕ m , ϕ m ⟩, we get ⟨ϕ m , G m ϕ m ⟩ = 0. However, this argument is only formal because G m introduces some growth in the chaos variable n, and we do not control the decay of ϕ m in n. Therefore, it is not clear that the "integration by parts" ⟨G m ϕ m , ϕ m ⟩ = −⟨ϕ m , G m ϕ m ⟩ is allowed. To overcome this difficulty we fix a function w : N 0 → R + of compact support and note that

∂ t ‖w(N )ϕ m ‖ 2 = 2⟨w(N )ϕ m , L 0 w(N )ϕ m ⟩ + 2⟨w(N )ϕ m , w(N )G m ϕ m ⟩,

where we used that L 0 commutes with w(N ). Let us focus on the second term on the right hand side, to which we apply Lemma 2.4. Note that these computations are rigorous since the compact support of w ensures that the inner product involves only a finite number of chaoses. Let us denote h(n) = w(n) 2 − w(n − 1) 2 ; then only commutator terms involving h survive for the G m − term. Consider now a function g : N 0 → [0, ∞) such that g(n) = 0 only if h(n) = h(n + 1) = 0; we will choose the precise form of g later. From the Cauchy-Schwarz inequality and estimate (8) we get a first bound, and then Young's inequality gives, for all δ > 0, a second one, which with another application of Young's inequality yields control of the commutator terms. Recall that a dyadic partition of unity consists of two functions ρ −1 , ρ ∈ C ∞ c (R) such that with ρ i := ρ(2 −i ·) for i ≥ 0 we have supp(ρ i ) ∩ supp(ρ j ) = ∅ for |i − j| > 1 and such that Σ i≥−1 ρ i (x) ≡ 1; see [BCD11, Chapter 2.2] for a construction. In the following we write i ∼ j if 2 i ≃ 2 j , i.e. if |i − j| ≤ L for some fixed L > 0. Let us take w(n) = ρ i (n) for a dyadic partition of unity, and g = Σ j∼i ρ j . Then we have for n ≃ 2 i

|h(n + 1)(n + 1)|/(g(n + 1)g(n) 1/2 ) ≲ |h(n + 1)(n + 1)| = |(ρ i (n + 1) 2 − ρ i (n) 2 )(n + 1)| ≤ (ρ i (n) + ρ i (n + 1))|ρ i (n + 1) − ρ i (n)|(n + 1) ≲ 1,

since |ρ i (n + 1) − ρ i (n)| ≲ 2 −i while n + 1 ≃ 2 i , and h(n + 1)(n + 1)/(g(n + 1)g(n) 1/2 ) = 0 unless n ≃ 2 i ; thus for all δ > 0 there exists a constant C = C(δ) > 0 controlling the error terms. From here we get, for α ∈ R and a new C = C(δ, α) > 0, a Gronwall-type differential inequality, and taking δ < 1 we deduce the bounds of Lemma 3.1.

Proof.
The first bound follows from our previous estimates and Gronwall's lemma. For the second bound, our estimates from above yield an integrated version of the differential inequality. Then we take δ = 1/2, bring the integral from the right hand side to the left, and send t → ∞ to deduce (23).
Remark 3.2. The norms appearing in the previous lemma can be brought to a more familiar "Sobolev" form with the help of the following simple result: For all α ∈ R and ϕ ∈ ΓL 2 we have ‖(1 + N ) α ϕ‖ 2 ≃ Σ i≥−1 2 2αi ‖ρ i (N )ϕ‖ 2 , where we used that Σ i ρ i 2 (n) ≃ 1. The reason for not directly working with this Sobolev type norm is that the dyadic partition of unity allows us to localize in n and therefore to rigorously justify the operations on G m + and G m − above. Compared to a hard cutoff, the smooth dyadic partition has the advantage that the transition from the support of ρ i to its complement is well behaved, while a hard cutoff gives too large a contribution and we cannot close our estimates.
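The telescoping property of the dyadic partition of unity is easy to check numerically. The following sketch (not from the paper) uses piecewise-linear ramps as a stand-in for the smooth bumps of [BCD11]; the function names and the specific ramps are illustrative.

```python
import numpy as np

def rho(x):
    """Piecewise-linear stand-in for the smooth dyadic bump rho:
    supported on [1/2, 2] with rho(1) = 1.  (The construction in
    [BCD11] uses C-infinity functions; linear ramps already exhibit
    the telescoping identity.)"""
    x = np.asarray(x, dtype=float)
    up = np.clip(2.0 * x - 1.0, 0.0, 1.0)    # ramp up on [1/2, 1]
    down = np.clip(2.0 - x, 0.0, 1.0)        # ramp down on [1, 2]
    return up * down

def rho_m1(x):
    """The low-frequency block rho_{-1}, equal to 1 near the origin."""
    return np.clip(2.0 - 2.0 * np.asarray(x, dtype=float), 0.0, 1.0)

# Sum over the partition: rho_{-1}(x) + sum_i rho(2^{-i} x) should be 1.
x = np.linspace(0.0, 100.0, 2001)
partition_sum = rho_m1(x) + sum(rho(x / 2.0 ** i) for i in range(0, 9))
```

Only neighbouring blocks overlap (supp ρ i ⊂ [2 i−1 , 2 i+1 ]), which is the property exploited when commuting w(N ) = ρ i (N ) past G m ± above.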
Corollary 3.3. For ϕ^m, α, and C as in Lemma 3.1 the analogous bounds hold.

Proof. We just showed the first estimate, and therefore also the second one, which is the claimed estimate.

Controlled solutions
The a priori bounds (24) and (25) suggest decomposing ϕ^m as a controlled function, that is, writing ϕ^m = K^m ϕ^{m,♯} for a suitable remainder ϕ^{m,♯}.
Convention. Throughout this section we consider a cutoff N_n in Lemma 2.14 that is adapted to the weights (1 + N)^β for all β that we encounter below.

Proof. It follows from (25) and Lemma 2.14 that the claimed bound holds, where in the last step we applied Proposition 2.18.
Unfortunately this estimate is not enough to show that ϕ^m ∈ D(L^m), which requires a bound on ‖(−L_0)ϕ^{m,♯}‖ + ‖(1 + N)^{9/2}(−L_0)^{1/2}ϕ^{m,♯}‖. In fact we will need even more regularity to deduce compactness in the right spaces. So let us analyze the equation for ϕ^{m,♯}: the second term on the right-hand side can be controlled with (17), which gives a bound for γ ≥ 0 and δ > 0. The remaining term (−L_0)^{−1}G^{m,≻}∂_t ϕ^m is more tricky. We can plug in the explicit form of the time derivative, ∂_t ϕ^m = G^{m,≺}ϕ^m + L_0 ϕ^{m,♯}, but then we have a problem with the term L_0 ϕ^{m,♯}, because it is of the same order as the leading term of the equation for ϕ^{m,♯}. Therefore we would like to gain a bit of regularity in (−L_0) from (−L_0)^{−1}G^{m,≻}, and indeed this is possible by slightly adapting the proof of Lemma 2.14; see Lemma A.2 in the appendix for details. This gives a bound for γ ∈ (1/2, 3/4). Recall that α(γ) = 9/2 + 7γ, and therefore 3/2 + α(γ − 1/4) ≤ α(γ), so the first term on the right-hand side is bounded by the same expression as in (29). For the remaining term we apply Young's inequality for products: there exists p > 0 such that for all ε ∈ (0, 1) the bound (30) holds. The first term on the right-hand side of (30) is under control by our a priori estimates, and the second term can be estimated using the regularizing effect of the semigroup (S_t) generated by L_0.

Proof. The variation of constants formula gives ϕ^{m,♯}(t) = S_t ϕ^{m,♯}_0 + ∫_0^t S_{t−s}(∂_s − L_0)ϕ^{m,♯}(s) ds, and by writing the explicit representation of S_t and L_0 in Fourier variables we easily see that ‖(−L_0)^β S_t ψ‖ ≲ t^{−β}‖ψ‖ for all β ≥ 0.
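In Fourier variables the regularizing bound at the end of the proof reduces to the elementary scalar fact sup_{λ≥0} λ^β e^{−λt} = (β/(et))^β, attained at λ = β/t. A quick numerical sanity check of this fact (the function name is ours, purely illustrative):

```python
import math

def sup_lambda(beta, t, grid=100000, lam_max=200.0):
    # Numerically approximate sup over 0 <= lam <= lam_max of lam^beta * exp(-lam * t).
    return max((lam_max * k / grid) ** beta * math.exp(-lam_max * k / grid * t)
               for k in range(grid + 1))

for beta in [0.25, 0.5, 1.0, 2.0]:
    for t in [0.1, 0.5, 2.0]:
        exact = (beta / (math.e * t)) ** beta  # supremum attained at lam = beta / t
        assert sup_lambda(beta, t) <= exact * (1 + 1e-9)
        assert sup_lambda(beta, t) >= exact * (1 - 1e-3)
```

This is the standard way the semigroup trades a factor t^{−β} for β derivatives, applied multiplier by multiplier.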
Since γ + 1/8 ∈ (1/2, 3/4), we can combine this with our previous estimates, and in that way we obtain a bound, for some K, K_T > 0 and for t ∈ [0, T]. The right-hand side does not depend on t, so we can take the supremum over t ∈ [0, T]; then we choose ε > 0 small enough that the last term on the right-hand side carries a prefactor of at most 1/2, bring it to the left, and thus obtain the claimed bound for the spatial regularity. For the temporal regularity, i.e. for ∂_t ϕ^{m,♯}, we simply use that ∂_t ϕ^{m,♯} = L_0 ϕ^{m,♯} + (∂_t − L_0)ϕ^{m,♯} and apply the previous bounds to the two terms on the right-hand side.
Remark 3.7. We focused on the backward equation, but by similar (and actually slightly easier) arguments we can also solve the resolvent equation (λ − L)ϕ = ψ for λ > C/2 and ψ ∈ U, where C > 0 is the constant from Corollary 3.3. Since U and D(L) are dense and L is dissipative by Lemma 2.22, it follows from Theorem 1.2.12 of [EK86] that L generates a strongly continuous contraction semigroup on L²(µ). Then we can apply Kolmogorov's extension theorem to construct, for all initial distributions with L² density with respect to µ, a Markov process corresponding to this semigroup. However, it seems somewhat subtle to obtain the continuity of trajectories or the link with the martingale problem in this way. To be in the setting of [EK86] we would need a semigroup on C_b(E) for a locally compact and separable state space E, but since we are in infinite dimensions our state space cannot be locally compact. A canonical state space would be H^{−1/2−}(T), but it also seems difficult to show that T_t maps C_b(H^{−1/2−}) to itself, let alone that it defines a semigroup on that space. So instead we will construct the process directly by a tightness argument based on the martingale problem.

The martingale problem
Definition 4.1. We say that a process (u_t)_{t≥0} with trajectories in C(R_+, S′), where S′ are the Schwartz distributions on T, solves the martingale problem for L with initial distribution ν if u_0 ∼ ν, if law(u_t) ≪ η for all t ≥ 0, and if for all ϕ ∈ D(L) and t ≥ 0 we have ∫_0^t |Lϕ(u_s)| ds < ∞ almost surely and the process

ϕ(u_t) − ϕ(u_0) − ∫_0^t Lϕ(u_s) ds,  t ≥ 0,

is a martingale in the filtration generated by (u_t). Note that since ϕ and Lϕ are not cylinder functions, we need the condition law(u_t) ≪ η in order for ϕ(u_t) and Lϕ(u_t) to be well defined.
Due to our lack of control for Lϕ outside of ΓL 2 , the following class of processes will play a major role in our study of the martingale problem. We will establish the existence of incompressible solutions to the martingale problem by a compactness argument. The duality of martingale problem and backward equation gives uniqueness of incompressible solutions to the martingale problem. Since the domain of L is rather complicated, we then study a "cylinder function martingale problem", a generalization of the energy solutions of [GJ14,GJ13,GP18a], and we show that every solution to the cylinder function martingale problem solves the martingale problem for L and in particular its law is unique.

Existence of solutions
In the following we show that under "near-stationary" initial conditions the Galerkin approximations (u^m)_m solving (2) are tight in C(R_+, S′), and that any weak limit is an incompressible solution to the martingale problem for the generator L in the sense of Definitions 4.1 and 4.2. The following elementary inequality will be used throughout this section.

Lemma 4.3. Let u^m be a solution to (2) with law(u^m_0) ≪ µ with density η ∈ L²(µ). Then for every positive functional Ψ of the trajectory we can bound its expectation in terms of ‖η‖_{L²(µ)} and its second moment under P_µ, where P_µ denotes the distribution of u^m under the stationary initial condition u^m_0 ∼ µ. In particular u^m is incompressible.

Proof. The Cauchy–Schwarz inequality and Jensen's inequality yield the claimed estimate.
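The omitted display can plausibly be reconstructed as follows (a sketch consistent with the statement of Lemma 4.3; the first identity unpacks the density of law(u^m_0) with respect to µ, and the inequality is Cauchy–Schwarz in L²(P_µ)):

```latex
\mathbb{E}\big[\Psi\big((u^m_t)_{t\ge 0}\big)\big]
  = \mathbb{E}_{\mu}\big[\eta(u^m_0)\,\Psi\big((u^m_t)_{t\ge 0}\big)\big]
  \le \|\eta\|_{L^2(\mu)}\,
      \mathbb{E}_{\mu}\big[\Psi\big((u^m_t)_{t\ge 0}\big)^2\big]^{1/2}.
```

In particular, taking Ψ = |ϕ(u_s)| for a fixed time s and ϕ ∈ ΓL², stationarity under P_µ gives E[|ϕ(u_s)|] ≤ ‖η‖_{L²(µ)} ‖ϕ‖, which is the incompressibility claim.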
Recall that D x denotes the Malliavin derivative.
Moreover, an analogous bound holds with a weight w.

Proof. For cylinder functions ϕ the claim follows from Itô's formula, and in that case the Burkholder–Davis–Gundy inequality gives a bound for all T > 0 in which the "energy" on the right-hand side can be computed explicitly. To prove tightness we need to control higher moments, and for this purpose a classical moment criterion is useful.

Theorem 4.6. Let η ∈ L²(µ) and let u^m be the solution to (2) with law(u^m_0) = ηdµ. Then (u^m)_{m∈N} is tight in C(R_+, S′), and any weak limit u is incompressible and solves the martingale problem for L with initial distribution ηdµ.
3. It remains to show that any weak limit u of (u^m) solves the martingale problem for L with initial distribution ηdµ. As u^m_0 ∼ ηdµ, any weak limit has initial distribution ηdµ. To show that u solves the martingale problem, first observe that for any ϕ ∈ ΓL² the approximation bound holds, and therefore for any bounded cylinder function ϕ^M the lim sup vanishes, which shows that the left-hand side equals zero because bounded cylinder functions are dense in ΓL². The same argument also gives the analogous statement for lim sup_{m→∞}. This is not quite sufficient, because u^m solves the martingale problem for L^m and not for L. But since ϕ ∈ D(L) there exists ϕ^♯ with ϕ = Kϕ^♯, so let us define ϕ^m = K^m ϕ^♯. It follows from the dominated convergence theorem and the proof of Lemma 2.14 that ‖ϕ^m − ϕ‖ → 0 as m → ∞. Moreover, L^m ϕ^m = L_0 ϕ^♯ + G^{m,≺} K^m ϕ^♯, and therefore another application of the dominated convergence theorem in the proof of Proposition 2.18 shows that ‖L^m ϕ^m − Lϕ‖ → 0. Hence the martingale property passes to the limit, which concludes the proof.
Remark 4.7. For simplicity we restricted our attention to η ∈ L²(µ), but it is clear that the same arguments show the existence of solutions to the martingale problem for initial conditions ηdµ with η ∈ L^q(µ) for q > 1. The key requirement is that we can control expectations under u^m in terms of higher moments under the stationary measure P_µ, which also works for η ∈ L^q(µ). The only difference is that for q < 2 we would have to adapt the definition of incompressibility and restrict the domain in the martingale problem from D(L) to D_{q′}(L), where q′ is the conjugate exponent of q. On the other hand, the uniqueness proof below really needs η ∈ L²(µ), because we can only control the solution to the backward equation in spaces with polynomial weights, but not with exponential weights.

Uniqueness of solutions
Let η ∈ ΓL 2 be a probability density (with respect to µ). Let the process (u t ) t 0 ∈ C(R + , S ′ ) be incompressible and solve the martingale problem for L with initial distribution u 0 ∼ ηdµ.
Here we use the duality of martingale problem and backward equation to show that the law of u is unique and that it is a Markov process with invariant measure µ. In Lemma A.3 in the appendix we show that for ϕ ∈ C(R_+, D(L)) ∩ C¹(R_+, ΓL²) the process

ϕ(t, u_t) − ϕ(0, u_0) − ∫_0^t (∂_s + L)ϕ(s, u_s) ds,  t ≥ 0,

is a martingale. This will be an important tool in the following theorem.

Theorem 4.8. Let η ∈ ΓL² with η ≥ 0 and ∫η dµ = 1. Let u be an incompressible solution to the martingale problem for L with initial distribution u_0 ∼ ηdµ. Then u is a Markov process and its law is unique. Moreover, µ is a stationary measure for u.
Next let ψ_1 be bounded and measurable and let ψ_2 ∈ U. Let 0 ≤ t_1 < t_2, and let ϕ_2 solve ∂_t ϕ_2 = Lϕ_2 with initial condition ϕ_2(0) = ψ_2; then the martingale property determines E[ψ_1(u_{t_1}) ψ_2(u_{t_2})]. Since we already saw that the law of u_{t_1} is uniquely determined, also the law of (u_{t_1}, u_{t_2}) is unique (by a monotone class argument). Iterating this, we get the uniqueness of law(u_{t_1}, …, u_{t_n}) for all 0 ≤ t_1 < … < t_n, and therefore the uniqueness of law(u_t : t ≥ 0).
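The omitted display is presumably the duality identity, obtained by applying Lemma A.3 to (s, x) ↦ ϕ_2(t_2 − s, x) on [t_1, t_2] (a hedged reconstruction):

```latex
\mathbb{E}\big[\psi_1(u_{t_1})\,\psi_2(u_{t_2})\big]
  = \mathbb{E}\big[\psi_1(u_{t_1})\,\varphi_2(t_2 - t_1,\, u_{t_1})\big],
```

since s ↦ ϕ_2(t_2 − s, u_s) satisfies (∂_s + L)ϕ_2(t_2 − s, ·) = 0 and is therefore a martingale on [t_1, t_2]. The right-hand side only involves the law of u_{t_1}, which is already known to be unique.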
To see the Markov property, let 0 ≤ t < s, let X be an F_t = σ(u_r : r ≤ t) measurable bounded random variable, and let ϕ_0 ∈ U. Then for the solution ϕ to the backward equation with initial condition ϕ(0) = ϕ_0 we obtain E[X ϕ_0(u_s)] = E[X ϕ(s − t, u_t)]. Now the Markov property follows by another density argument.
To see that u is stationary with respect to µ it suffices to consider the specific approximation that we used in the existence proof, i.e. the Galerkin approximation with initial distribution law(u m 0 ) = µ. This is a stationary process and it converges to the solution of the martingale problem, which therefore is also stationary and has initial distribution µ.
Remark 4.9. The strong Markov property seems difficult to obtain with our tools: if τ is a stopping time, then there is no reason why the law of u_τ should be absolutely continuous with respect to µ, regardless of the initial distribution of u. Since such absolute continuity is crucial for our method, it is not clear how to deal with (u_{τ+t})_{t≥0} (although formally, of course, the same arguments as above apply).
Definition 4.10. We let (T_t)_{t≥0} be the semigroup on ΓL² defined, for ϕ ∈ ΓL², via ⟨T_t ϕ, η⟩, where u solves the martingale problem for L with initial condition law(u_0) = ηdµ; for general η ∈ ΓL² we define ⟨T_t ϕ, η⟩ by linearity, and by the Riesz representation theorem we have indeed T_t ϕ ∈ ΓL².
Proposition 4.11. The semigroup (T_t)_{t≥0} is a strongly continuous contraction semigroup on ΓL², and ∂_t T_t ϕ|_{t=0} = Lϕ for all ϕ ∈ D(L). The Hille–Yosida generator of (T_t)_{t≥0} is an extension of L.
Proof. Since µ is stationary for u we have ‖T_t ϕ‖ ≤ ‖ϕ‖ for all t ≥ 0, i.e. (T_t) is a contraction semigroup. From the martingale problem it also follows that T_t ϕ − ϕ = ∫_0^t T_s Lϕ ds for ϕ ∈ D(L), and therefore we get the strong continuity in t, which by approximation extends to t ↦ T_t ψ for all ψ ∈ ΓL². We conclude that ∂_t T_t ϕ|_{t=0} = Lϕ, and thus the Hille–Yosida generator is an extension of L.

Exponential ergodicity
The Burgers generator formally satisfies a spectral gap estimate and should thus be exponentially L²-ergodic. Indeed, its symmetric part is L_0, for which the spectral gap is known, and its antisymmetric part G should not contribute to the spectral gap estimate; see e.g. [GZ03, Definition 2.1]. Having identified a domain for L, we can make this formal argument rigorous. We remark that the ergodicity of the Burgers equation was already shown in [HM18], even in a stronger sense; the only new result here is the exponential speed of convergence (and our proof is very simple). Consider ϕ ∈ U and let (ϕ(t)) be the unique solution to the backward equation that we constructed in Theorem 3.6, starting from ϕ(0) = ϕ. From Proposition 4.11 we know that T_t ϕ = ϕ(t), and from Lemma 2.22 we obtain a differential inequality for ‖ϕ(t)‖². Assume that ∫ϕ dµ = ϕ_0 = 0 for the zeroth chaos component, which by construction holds whenever (K^{−1}ϕ)_0 = 0. Using the stationarity of (u_t) with respect to µ we see that then also (ϕ(t))_0 = 0. Recall that F(ϕ(t))_n(k_{1:n}) = 0 whenever k_i = 0 for some i, which leads to ‖(−L_0)^{1/2}ϕ(t)‖² ≥ |2π|² ‖ϕ(t)‖², and thus ∂_t ‖ϕ(t)‖² ≤ −8π² ‖ϕ(t)‖²; Gronwall's inequality then gives exponential decay. This holds for all ϕ ∈ U with ∫ϕ dµ = 0, but since the left- and right-hand sides can be controlled in terms of ‖ϕ‖, it extends to all ϕ ∈ ΓL² with ∫ϕ dµ = 0. There are two main consequences; in particular, µ is ergodic, and there is no invariant measure that is absolutely continuous with respect to µ other than µ itself.
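Spelled out, the chain of estimates is (a sketch; the first identity is what we take Lemma 2.22 to provide, using the antisymmetry of G on the domain):

```latex
\partial_t \|\varphi(t)\|^2
  = -2\,\big\|(-L_0)^{1/2}\varphi(t)\big\|^2
  \le -8\pi^2\,\|\varphi(t)\|^2,
\qquad\text{hence}\qquad
\|\varphi(t)\| \le e^{-4\pi^2 t}\,\|\varphi(0)\|
```

by Gronwall, where the middle inequality uses that on the n-th chaos the Fourier multiplier of −L_0 is ∑_{i=1}^n |2πk_i|² ≥ 4π², since all k_i ≠ 0 and n ≥ 1 once the zeroth chaos component vanishes.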

Martingale problem with cylinder functions
The martingale approach to the Burgers equation is particularly useful for proving that the equation arises as the scaling limit of particle systems. The disadvantage of the martingale problem based on controlled functions is that, given a microscopic system for which we want to prove that it scales to the Burgers equation, it may be difficult to find similar controlled functions before passing to the limit. Instead it is often more natural to derive a characterization of the scaling limit based on cylinder test functions. Here we show that in some cases this characterization already implies that the limit solves our martingale problem for the controlled domain of the generator, and therefore it is unique in law. The biggest restriction is that we have to assume that the process allows for the Itô trick.

Definition 4.12. A process (u_t)_{t≥0} with trajectories in C(R_+, S′) solves the cylinder function martingale problem for L with initial distribution ν if u_0 ∼ ν and the following conditions are satisfied: i. the bound E[|ϕ(u_t)|] ≲ ‖ϕ‖ holds locally uniformly in t, i.e. u is incompressible; ii. there exists an approximation of the identity (ρ_ε) such that for all f ∈ C^∞(T) the associated process is a continuous martingale in the filtration generated by (u_t); iii. the Itô trick works: for all cylinder functions ϕ and all p ≥ 1 the corresponding moment bound holds.

Remark 4.13. In [GJ14, GJ13] so-called stationary energy solutions to the Burgers equation are defined. The definition in [GJ13] makes the following alternative assumptions: i'. for all times t ≥ 0 the law of u_t is µ; ii'. the conditions in ii. above hold, and additionally the process lim_{ε→0} ∫_0^t L^ε u_s(f) ds has vanishing quadratic variation; iii'. the relevant process is a continuous martingale in the filtration generated by (û_t), with quadratic variation ⟨M^f⟩_t = 2t ‖∂_x f‖²_{L²}. Clearly i'. and ii'. are stronger than i. and ii., and it is shown in [GP18a, Proposition 3.2] that any process satisfying i'., ii'., iii'.
also satisfies the first inequality in a chain of estimates, where the second inequality uses Remark 4.5 and the third is from (35). If ∫ϕ dµ = 0 we can solve −L_0 ψ = ϕ, and then (37) applied to ψ gives a stronger version of iii. Therefore we also have uniqueness in law for any process which satisfies i'., ii'. and iii'., or alternatively i., ii., and (37).
Note that the constant c N 2p in iii. is not a typo. This is what we get if we consider a non-stationary process whose initial condition has an L 2 -density with respect to µ and we apply Lemma 4.3 to pass to a stationary process that has the properties above.
In the following we assume that u solves the cylinder function martingale problem for L with initial distribution ν, and we fix the filtration F_t = σ(u_s : s ∈ [0, t]), t ≥ 0.
Lemma 4.14. Let ϕ(u) = Φ(u(f_1), …, u(f_k)) ∈ C be a cylinder function. Then the corresponding process is a martingale.

Proof. By Itô's formula applied to the approximating dynamics we obtain a martingale decomposition. In [GP18a, Corollary 3.17] it is shown that for all α < 3/4 and all T > 0 the relevant convergence holds in C^α([0, T], R), the space of α-Hölder continuous functions. Strictly speaking, in [GP18a] only the approximation ∂_x(Π_m u)² is considered, but it is easy to generalize the analysis to ∂_x Π_m(Π_m u)². In particular we can interpret ∫ ∂_iΦ(u^m_s(f_1), …, u^m_s(f_k)) dA^{m,f_i}_s as a Young integral, and e.g. by Theorem 1.16 in [LCL07] together with the Cauchy–Schwarz inequality we can bound it whenever β > 1 − α and α < 3/4. Since ∂_iΦ is locally Lipschitz continuous with polynomial growth of the derivative and we can take β < α, the convergence of the first expectation to zero follows from the convergence of u^m to u in L^p(C^α([0, T], R)). The second expectation is uniformly bounded in m by the considerations above, and therefore the difference converges to zero. Very similar arguments yield the convergence of the remaining terms, and since all the convergences are in L¹ the martingale property passes to the limit. While it is not obvious from the proof, here we already used that the Itô trick works for (u_t): indeed, Corollary 3.17 of [GP18a] crucially relies on it.
Theorem 4.15. Let u solve the cylinder function martingale problem for L with initial distribution ν. Then u solves the martingale problem for L in the sense of Section 4.1, and in particular its law is unique by Theorem 4.8.
Proof. Let ϕ ∈ D(L) and define ϕ^M as the projection of ϕ onto the chaos components of order ≤ M, where in each chaos we also project onto the Fourier modes with |k|_∞ ≤ M. In particular ϕ^M ∈ C, and by Lemma 4.14 the associated process is a martingale. If we can pass to the limit M → ∞ and identify the limiting drift with ∫_0^t Lϕ(u_s) ds, then the proof is complete. Since we saw in the proof of Lemma 4.14 that the integral ∫_0^t L^m ϕ^M(u_s) ds converges in L¹, we can take the limit in m out of the expectation (or we could just apply Fatou's lemma), so it suffices to show that the right-hand side of the following inequality vanishes: for the first term on the right-hand side this follows from ‖(−L_0)^{1/2}ϕ‖ ≲ ‖(−L_0)^{1/2}ϕ^♯‖ by Lemma 2.14 and from the dominated convergence theorem. For the second term we have, by the triangle inequality and Lemma 2.8, a bound whose first term vanishes as M → ∞ by the same argument as before, and whose second term vanishes by the uniform estimates of Lemma 2.8 together with the dominated convergence theorem, which shows that (G^m − G) goes to zero as m → ∞.

Extensions
The uniqueness in law of solutions to the cylinder function martingale problem is not new, the stationary case was previously treated in [GP18a] and a non-stationary case (even slightly more general than the one we study here) in [GP18b]. This was extended to Burgers equation with Dirichlet boundary conditions in [GPS17]. However, these works are crucially based on the Cole-Hopf transform that linearizes the equation, and they do not say anything about the generator L. In the following we show that our arguments adapt to some variants of Burgers equation, none of which can be linearized via the Cole-Hopf transform. In that sense our new approach is much more robust than the previous one.

Multi-component Burgers equation
Let us consider the multi-component Burgers equation studied in [FH17, KM17]. This equation reads, for u ∈ C(R_+, (S′)^d), as a system driven by independent space-time white noises (ξ¹, …, ξ^d), where we assume the so-called trilinear condition of [FH17]: Γ is symmetric in its three arguments (i, j, j′). Under this condition the product measure µ^{⊗d} is invariant for u, also at the level of the Galerkin approximation; see Proposition 5.5 of [FH17]. We can interpret µ^{⊗d} as a white noise on L²_0({1, …, d} × T) ≃ L²_0(T, R^d), equipped with the inner product ⟨f, g⟩_{L²(T×{1,…,d})}, where we assume that f̂(i, 0) := f̂^i(0) = 0 for all i, and similarly for g; see also Example 1.1.2 of [Nua06]. To simplify notation we write T_d = T × {1, …, d} in what follows, not to be confused with T^d. Cylinder functions now take the form ϕ(u) = Φ(u(f_1), …, u(f_J)) for Φ ∈ C²_p(R^J), where the duality pairing u(f) is defined componentwise, and in the following we switch between the notations f^i(x) = f(i, x) depending on what is more convenient. The chaos expansion takes symmetric kernels ϕ_n ∈ L²_0(T_d^n) as input, and the Malliavin derivative acts on the cylinder function ϕ(u) = Φ(u(f_1), …, u(f_J)) as in the one-dimensional case, where from now on we write ζ for the elements of T_d. We also have D_ζ W_n(ϕ_n) = nW_{n−1}(ϕ_n(ζ, ·)) as for d = 1. Let us formally define δ^{(iy)}_{(jx)} = 1_{i=j} δ(x − y); then the Burgers part of the generator is formally given by the analogous expression. This becomes rigorous if we consider the Galerkin approximation with cutoff Π_m, but for simplicity we continue to argue formally in the limit m = ∞. We have the following generalization of Lemma 2.4, with bounds valid for all ϕ_{n+1} ∈ L²(T_d^{n+1}) and ϕ_n ∈ L²(T_d^n).

Proof. This follows similarly as in Lemma 2.4, making constant use of the trilinear condition for Γ.
Proof. The proof is more or less the same as for d = 1.
In other words, G_+ and G_− are finite linear combinations of mild variations of the operators that we considered for d = 1. In particular they satisfy all the same estimates, and we obtain the existence and uniqueness of solutions for the martingale problem for L = L_0 + G_+ + G_− as before, and also for the cylinder function martingale problem.

Fractional Burgers equation
In the paper [GJ13] the authors not only study our stochastic Burgers equation, but also the fractional generalization in which the Laplacian is replaced by −A^θ with A = −∆, for θ > 1/2. They define and construct stationary energy solutions for all θ > 1/2, and they prove uniqueness in distribution for θ > 5/4. Here we briefly sketch how to adapt our arguments to deduce uniqueness for θ > 3/4, also for the non-stationary equation, as long as the initial condition is absolutely continuous with density in L²(µ). Unfortunately we cannot treat the limiting case θ = 3/4, which would be scale-invariant and which plays an important role in the work [GJ18].
In Section 4 of [GJ13] it is shown that the distribution µ of the white noise is still invariant for u. By adapting the arguments of Lemma 3.7 in [GP18a] we see that the (formal) generator of u is given in terms of L_θ, where F(L_θ ϕ)_n(k_{1:n}) = −(|2πk_1|^{2θ} + · · · + |2πk_n|^{2θ}) ϕ̂_n(k_{1:n}).
Up to multiples of N we can estimate (−L_θ) by (−L_0)^θ and vice versa, so we expect that (−L_θ)^{−1} gains regularity of order (−L_0)^{−θ}. We saw in Lemma 2.8 that G loses (−L_0)^{3/4} regularity, and therefore it is natural to assume θ > 3/4, so that we can gain back more regularity from the linear part of the dynamics than the nonlinear part loses. To construct controlled functions we only need to slightly adapt Lemma 2.14 and replace (−L_0)^{−1} by (−L_θ)^{−1}. For simplicity we restrict our attention to θ ≤ 1, because this allows us to estimate the relevant multipliers directly.

Lemma 5.3. Let θ ∈ (3/4, 1], let w be a weight, let γ ∈ (1/4, 1/2], and let L ≥ 1. Then the analogue of the bound in Lemma 2.14 holds, where the implicit constant on the right-hand side is independent of w. From here the construction of controlled functions ϕ = Kϕ^♯ = (−L_θ)^{−1} G^≻_+ ϕ + ϕ^♯ for given ϕ^♯ works as in Lemma 2.14.
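On the n-th chaos, the mutual estimate between (−L_θ) and (−L_0)^θ "up to multiples of N" is the elementary inequality (∑ a_i)^θ ≤ ∑ a_i^θ ≤ n^{1−θ} (∑ a_i)^θ for a_i ≥ 0 and θ ∈ (0, 1], by subadditivity of x ↦ x^θ and Hölder's inequality. A numeric spot check with a_i = |2πk_i|² (function names are ours, purely illustrative):

```python
import math
import random

def chaos_multipliers(ks, theta):
    # a_i = |2 pi k_i|^2; returns the multipliers of (-L_0)^theta and (-L_theta)
    # on the n-th chaos, plus the Hoelder upper bound n^{1-theta} (-L_0)^theta.
    a = [(2.0 * math.pi * k) ** 2 for k in ks]
    n = len(a)
    return (sum(a) ** theta,
            sum(x ** theta for x in a),
            n ** (1.0 - theta) * sum(a) ** theta)

random.seed(0)
for _ in range(1000):
    n = random.randint(1, 20)
    ks = [random.choice([k for k in range(-30, 31) if k != 0]) for _ in range(n)]
    for theta in [0.76, 0.9, 1.0]:
        low, mid, high = chaos_multipliers(ks, theta)
        assert low <= mid * (1 + 1e-12) and mid <= high * (1 + 1e-12)
```

For θ = 1 the three quantities coincide, and the gap between the lower and upper bound is exactly the factor N^{1−θ} appearing in the text.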
Proposition 2.18 remains essentially unchanged in our setting, because for ϕ = Kϕ^♯ we have Lϕ = G^≺ϕ + L_θ ϕ^♯. The only difference is that, since we still want to measure regularity in terms of (−L_0), we have ‖L_θ ϕ^♯‖ ≤ ‖N^{1−θ}(−L_0)ϕ^♯‖ by Hölder's inequality. Also the proof of Lemma 2.19 carries over to our setting, and the analysis of the backward equation is more or less the same as before. The main difference is that now we only have a priori estimates in (−L_0)^{−θ/2}ΓL² and no longer in (−L_0)^{−1/2}ΓL² (with weights in N). But for the controlled analysis it is only important to have an a priori estimate in (−L_0)^{−1/4−δ}ΓL², because that is what we need to control the contribution from G^≺. So since θ/2 > 3/8 > 1/4 the same arguments work, and then we obtain the existence and uniqueness of solutions to the backward equation and the martingale problem by the same arguments as for θ = 1; also the cylinder function martingale problem has unique solutions in this case.

Burgers equation on the real line
The Burgers equation on R_+ × R is very similar to the case of periodic boundary conditions. The only difference is that now, instead of sums over Fourier modes, we have to consider integrals, which might lead to divergences at k ≃ 0. But since most of our estimates boil down to an application of Lemma A.1, and this lemma remains true if the sum over k is replaced by an integral, most of our estimates still work on the full space. In fact all estimates in Section 2 remain true, but some of them are no longer so useful, because we no longer have ‖ϕ‖ ≲ ‖(−L_0)^γ ϕ‖ for γ > 0 and ∫ϕ dµ = 0. But we can strengthen the results as follows (with the difference to the previous results marked in blue): • In Lemma 2.14 we can use the cutoff 1_{|k_{1:n}|_∞ > N_n} to obtain a strengthened estimate. Similarly we get in Lemma A.2 the better bound ‖w(N)(1 − L_0)^γ (−L_0)^{−1} G^≻ ϕ‖ ≲_{|w|} ‖w(N)(1 + N)^{3/2} (−L_0)^{γ−1/4} ϕ‖.
• The definition of the domain in Lemma 2.19 is problematic now, because it does not even guarantee that D(L) ⊂ ΓL². So instead we use a modified definition. • The analysis in Section 3.1 does not change, and Lemma 3.1 together with Corollary 3.3 gives an a priori bound on ‖(1 + N)^α (1 − L_0)^{1/2} ϕ^m‖ and ‖(1 + N)^α ∂_t ϕ^m‖ in terms of ‖ϕ^m_0‖. • In the controlled analysis of Section 3.2 we can strengthen the bound from Lemma 3.4 to control ‖(1 + N)^α (1 − L_0)^{1/2} ϕ^{m,♯}‖ in terms of ‖ϕ^{m,♯}_0‖, and this is sufficient to control ‖(1 − L_0)^γ G^{m,≺} ϕ^m‖. Also for the other terms we now bound ‖(1 − L_0)^γ (·)‖ instead of ‖(−L_0)^γ (·)‖. Here we need the strengthened version of Lemma A.2 mentioned above, and we also use that ‖(1 + N)^α (1 − L_0)^β S_t ψ‖ ≲ (t^{−β} ∨ 1) ‖(1 + N)^α ψ‖. In the end we get strong solutions to the backward equation for initial conditions in a space U_α defined through an intersection over γ ∈ (3/8, 5/8). • Existence and uniqueness for the martingale problem are exactly the same as on the torus; the only difference is that we have to use the strengthened version of Proposition 2.18 to approximate cylinder functions by functions in D(L).
In that way all results from Sections 2.3–4, apart from Section 4.3, carry over to the Burgers equation on the full space. Of course the exponential ergodicity of Section 4.3 does not hold on the full space, because L_0 no longer has a spectral gap.

A Auxiliary results
The following simple estimate is used many times, so we formulate it as a lemma.
Lemma A.1. Let C ≥ 0, a > 1/2, and k ∈ Z be such that k² + C > 0. Then the stated sum is bounded by a multiple of (k² + C)^{1/2−a}.

Proof. Since p² + (k − p)² ≃ p² + k², we can compare the sum with an integral, and since 2a > 1 the integral on the right-hand side is finite and our claim follows.
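The elided display is presumably ∑_{p∈Z} (C + p² + (k−p)²)^{−a} ≲ (C + k²)^{1/2−a}, which is what the proof strategy (compare the summand with (C + p² + k²)^{−a} and the sum with an integral) produces. Under that assumption, a numerical illustration that the ratio of the two sides stays bounded above and below (names are ours, sum truncated):

```python
def lemma_a1_ratio(k, a, C, pmax=5000):
    # Truncated sum over p in Z of (C + p^2 + (k - p)^2)^(-a),
    # normalized by the conjectured bound (C + k^2)^(1/2 - a).
    s = sum((C + p * p + (k - p) ** 2) ** (-a) for p in range(-pmax, pmax + 1))
    return s / (C + k * k) ** (0.5 - a)

for a in [0.75, 1.0, 2.0]:
    for C in [0.0, 1.0, 10.0]:
        for k in [1, 2, 4, 8, 16, 32, 64]:
            r = lemma_a1_ratio(k, a, C)
            assert 0.05 < r < 30.0  # uniformly bounded ratio over this range
```

The requirement a > 1/2 is what makes the comparison integral ∫ (C + p² + k²)^{−a} dp ≃ (C + k²)^{1/2−a} finite, and the condition k² + C > 0 rules out a division by zero in the p = 0 and p = k terms.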
The convergence of the Lebesgue integrals is in L 1 , and therefore the martingale property is inherited in the limit: