Fluctuations Around a Homogenised Semilinear Random PDE

We consider a semilinear parabolic partial differential equation in \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathbf{R}_+\times [0,1]^d$$\end{document}R+×[0,1]d, where \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$d=1, 2$$\end{document}d=1,2 or 3, with a highly oscillating random potential and either homogeneous Dirichlet or Neumann boundary condition. If the amplitude of the oscillations has the right size compared to its typical spatiotemporal scale, then the solution of our equation converges to the solution of a deterministic homogenised parabolic PDE, which is a form of law of large numbers. Our main interest is in the associated central limit theorem. Namely, we study the limit of a properly rescaled difference between the initial random solution and its LLN limit. In dimension \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$d=1$$\end{document}d=1, that rescaled difference converges as one might expect to a centred Ornstein–Uhlenbeck process. However, in dimension \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$d=2$$\end{document}d=2, the limit is a non-centred Gaussian process, while in dimension \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$d=3$$\end{document}d=3, before taking the CLT limit, we need to subtract at an intermediate scale the solution of a deterministic parabolic PDE, subject (in the case of Neumann boundary condition) to a non-homogeneous Neumann boundary condition. Our proofs make use of the theory of regularity structures, in particular of the very recently developed methodology allowing to treat parabolic PDEs with boundary conditions within that theory.


Introduction
Fix D = [0, 1] d with d ≤ 3, and consider the family of functions u ε : [0, T ] × D → R solving the PDE ∂ t u ε (t, x) = u ε (t, x) + H (u ε (t, x)) + G(u ε (t, x))η ε (t, x), u ε (0, x) = u 0 (x), (1.1) endowed with either Dirichlet boundary conditions u ε (t, x) = 0 for x ∈ ∂ D or Neumann boundary conditions n(x), ∇u ε (t, x) = 0, where n denotes the outward facing unit vector normal to the boundary of D. The driving noise η ε appearing in this equation is given by η ε (t, x) = ε −1 η(ε −2 t, ε −1 x), (1.2) where η(t, x) is a stationary centred random field, which we do not assume Gaussian, but with relatively good mixing properties (see Assumption 2.1 for details) and moments of all orders after testing against a test function. Note that η ε is scaled by ε −1 rather than ε −(d+2)/2 , so the noise from [10,14] (which were restricted to d = 1) has been multiplied by ε d/2 . In the case when G is linear and H = 0, this problem has been well studied. See for example [1] for the case where furthermore η is Gaussian and constant in time [note that in our case m = 2 so the exponent α appearing there would equal 1 in our case, as used in (1.2)], [4] for a similar result in the non-Gaussian case.
(By stationarity, the functions κ p only depend on the differences of their arguments.) Here and below, we always use the convention that z (resp. z i ,z, etc) denotes a space-time coordinate given by z = (t, x) (resp. z i = (t i , x i ),z = (t,x), etc). We furthermore normalise our problem by assuming that the covariance of η integrates to 1 in the sense that R d+1 κ 2 (0, z) dz = 1.
(1. 3) In particular, we assume that κ 2 (0, ·) is absolutely integrable, but this will in any case follow from Assumption 2.1. Furthermore, define the constants = P(z) κ 2 (0, z) dz, = P(z) xκ 2 (0, z) dz, where P denotes the heat kernel, that is the fundamental solution to the heat equation on the whole space. Here and below, symbols drawn in red denote fixed constants, while symbols drawn in blue will later denote basis vectors of a suitable regularity structure associated to our problem. In dimension d = 1, our assumptions on κ 2 will guarantee that the integral converges absolutely, while in dimensions 2 and 3 our assumptions on κ 2 and κ 3 will guarantee that all of these integrals converge absolutely. The scaling ε −1 chosen in η ε is such that u ε converges as ε → 0 to a limit u (0) , which is our first result. Indeed, writing H η (u) = H (u) + G (u)G(u), (1.4) we have the following "law of large numbers": Theorem 1.1. Let u ε be as above and let u (0) be the (local) solution to the deterministic PDE ∂ t u (0) = u (0) + H η (u (0) ), (1.5) with the same initial condition u 0 ∈ C α as (1.1) [for some arbitrary α ∈ (0, 1)] and with homogeneous Dirichlet (resp. Neumann) boundary condition. In the case of Dirichlet boundary conditions, we impose that u 0 vanishes on the boundary. Assume that the functions G, H : R → R are of class C 5 and C 4 respectively, that the driving field η satisfies Assumption 2.1, and let furthermore T > 0 be such that the (possible) explosion time for u (0) is greater than T . Then, in probability and uniformly over [0, T ] × D, u ε converges to u (0) as ε → 0.
The proof of this result will be given in Section 3. Our main quantity of interest however are the fluctuations of u ε around its limit u (0) . One interesting feature of this problem is that in order to see these fluctuations, it is not sufficient to recenter around u (0) . Instead, as soon as d ≥ 2, a suitable first-order correction u (1) living at scale ε has to be subtracted first. Our precise "central limit theorem" then takes the following form: Theorem 1.2. Let u 0 be such that its extension to all of R d by reflections 1 is of class C 3 , let u ε and u (0) be as above and assume G, H , T and η are as in Theorem 1.1. Let furthermore u (1) = 0 for d = 1 and, for d ∈ {2, 3}, let u (1) be the solution to ∂ t u (1) = u (1) + H η (u (0) )u (1) + (u (0) , ∇u (0) ), with zero initial condition. (Note that is an R d -valued constant.) In the case of Neumann boundary condition, we furthermore impose that ∇u (1)  u ε − u (0) − εu (1) where v is the Gaussian process solving Remark 1.4. In the case of Neumann boundary conditions, it may appear paradoxal that, even though u ε , u (0) and v all satisfy homogeneous boundary conditions, u (1) does not! This phenomenon is very similar to the presence of the "boundary renormalisation" that can appear in the context of singular SPDEs [9]. There is no contradiction since the convergence v ε → v takes place in a very weak topology in which the notion of "normal derivative at the boundary" is meaningless in a pointwise sense. (A very simple example displaying a similar phenomenon is n −1/2 sin(nx), whose derivative at the origin diverges like √ n while that of its limit vanishes.) Remark 1. 5. Regarding the precise meaning of the equation fulfilled by u (1) in the case of Neumann boundary condition, denote by δ ∂ D the distribution on R × ∂ D given by where the integration over the faces of ∂ D are performed against the two-dimensional Lebesgue measure. With this notation, the solution to any equation of the form ∂ t u = u + f in D, ∇u(t, x), n(x) = g(x) on ∂ D, u(0, x) = u 0 (x), (1.9) (and in particular the equation determining u (1) ) is defined as the solution to the integral equation (1. 10) where P Neu denotes the homogeneous Neumann heat kernel, with the convention that g(t, x) = 0 for t ≤ 0, and where 1 D Here, we used the notation φ(z) η( dz) for the usual pairing between a distribution η and a suitable test function φ. To see that solutions to (1.10) and (1.9) do indeed coincide if f and g are sufficiently regular for the solution to be differentiable up to the boundary, it suffices to note that the mild formulation is equivalent to the weak formulation, see for example [20], with the term gδ ∂ D appearing as the boundary term when integrating by parts. Remark 1. 6. In dimension d = 1, the term u (1) in (1.7) is of course redundant. In dimension d = 2, it is still the case that ε −d/2 (u ε − u (0) ) converges to a limit, but this limit is not centred anymore. In higher dimensions, additional corrections appear. We expect to have a result of the form (1.11) where u (0) is as above and theū (k) satisfy an equation of the type for some inhomogeneity k depending on the u ( ) for < k and some of their derivatives. Since v has vanishing expectation, we expect to also have so that the u (k) provide an expansion of Eu ε in powers of ε. Note however that the techniques used in this paper do not provide moment bounds on the solution, so that even in d ≤ 3 this would require some additional work.
Remark 1.7. The form of the terms appearing in the successive correctors as well as the constants multiplying them can in principle be derived from [2,Eq. 2.12] which describes the form of the counterterms ϒ associated to a given tree τ . The recursion given there suggests a correspondence ∼ G(u) with incoming edges corresponding to functional derivatives with respect to u (so ∼ G (u) and ∼ G (u)) and powers of X corresponding to formal directional derivatives, so ∼ G (u)∇u. This shows a priori that the counterterm multiplying for example must be of the form G (u)G(u), etc. The numerical constants multiplying these terms do however differ from those appearing in [2] since their meaning is slightly different: in [2] we use counterterms to "recenter" the original equation in order to obtain a finite limit while here we leave the original equation untouched and compute its "centering". If we leave aside the behaviour at the boundary, this in principle allows to guess the general form of the equations for the u (k) appearing in (1.11) for any dimension.
The most surprising part of Theorem 1.2 is surely the fact that in the case when u ε has homogeneous Neumann boundary conditions, even though v and u (0) both also have homogeneous boundary conditions, u (1) does not, which seems to contradict (1.7). This is of course not a contradiction but merely suggests that if we write v ε for the expression appearing under the limit in (1.7), then v ε exhibits a kind of boundary layer. Note also that the statement that "v satisfies homogeneous boundary conditions" only makes sense in terms of the integral equation that it solves since v itself is not differentiable at the boundary. (It is not even a function!) Before we proceed, let us give a heuristic explanation for the appearance of this boundary layer. Consider the simplest case H = 0, G(u) = u and u 0 > 0, in which case we can consider the Hopf-Cole transform h ε = log u ε , yielding ∂ t h ε = h ε + |∇h ε | 2 + η ε .
To leading order, one would expect the right hand side to behave like |∇h ε | 2 E|∇ Z ε | 2 , where Z ε solves ∂ t Z ε = Z ε +η ε endowed with homogeneous Neumann boundary conditions. It turns out that, in the interior of the domain, one has lim ε→0 E|∇ Z ε | 2 = , which allows one to "guess" the correct limit u (0) . On the boundary however ∇ Z ε = 0, so that one expects E|∇ Z ε | 2 − to be of order O(1) in a layer of width O(ε) around ∂ D. When going to the next scale, this results in a boundary correction of order O(ε −1 ) in this boundary layer, which precisely scales like a surface measure on the boundary. Remark 1.5 shows that the net effect of this correction is to modify the boundary condition.
The remainder of the article is structured as follows. First, in Section 2, we formulate our main assumption on the driving noise η and we show that this assumption is "reasonable" by exhibiting an explicit class of examples for which it is satisfied. In Section 3, we then show that the law of large numbers holds. Although this could probably be shown by "classical" means without too much effort, we will use the theory of regularity structures because it shortens the argument and allows us to introduce some results and notation that will be of use later on. In Section 4, we then show that the central limit theorem holds. The main tool in this proof is the convergence of a certain "model" for an appropriate regularity structure as well as refinements of the type of boundary estimates first considered in [9]. The convergence of the model is given in Section 5. "Appendix B" is devoted to the proof of a result showing that the operations of "convolution by a singular kernel" and "multiplication by a smooth function" almost commute, modulo a much smoother remainder, a fact that will undoubtedly sound familiar to anyone acquainted with microlocal analysis. "Appendix C" contains a version of the reconstruction theorem that is purpose-built to allow us to deal with modelled distribution that have very singular boundary behaviour and goes beyond the version obtained in [9]. This appendix was written in collaboration with Máté Gerencsér.

Assumptions on the Noise
In this section, we formulate our precise assumptions on the driving noise and we show that they are satisfied for example by a mollified Poisson process. In a nutshell, we want to assume that correlations are bounded by z −z −2c at small scales and z −z −2c at large scales with c = 1 2 − δ and c = d+2 2 + δ for some δ ∈ (0, 1 2 ). However, we also want to encode the fact that higher order cumulants behave "better" than what is obtained from simply using the Cauchy-Schwartz inequality. Note that our assumptions are trivially satisfied by any continuous Gaussian process with correlations that decay at least like z −z −2c . Here and below, · will always denote the parabolic distance between space-time points. It will be convenient (in particular in "Appendix A") to make sure that · is smooth away from the origin, so we set for example z 4 = (t, x) 4 = |x| 4 + |t| 2 .

Coalescence Trees
In order to formulate this precisely, we need a simplified version of the construction of [15,Appendix A]. Given any configuration (z 1 , . . . , z p ) of p points in R d+1 with all distances distinct, we associate to it a binary tree T in the following way. Consider Kruskal's algorithm [17] for constructing the minimal spanning tree of the complete graph with vertices {z 1 , . . . , z p } and edge-weights given by their (parabolic) distances. One way of formalising this is the following. Consider the set P p of partitions of p = {1, . . . , p}. We define a distance d z between subsets of p as the Hausdorff distance induced by {z 1 , . . . , z p }, namely We then define a map K : P p → P p in the following way. If π = { p }, then K (π ) = π . Otherwise, let A = B ∈ π be such that d z (A, B) ≤ d z (C, D) for all C, D ∈ π . Thanks to our assumption on the z i , this pair is necessarily unique. We then set that is K (π ) is obtained by coalescing the two sets A and B in the partition π . The vertices of T are then given by V T = n≥0 K n ({{1}, . . . , { p}}), that is V T consists of all the blocks of those partitions. The set V T comes with a natural partial order given by inclusion: It is easy to verify that T is a binary tree and that its leaves are precisely given by the singletons. It will be convenient to also add to V T a "point at infinity" which is connected to p by an edge ( p , ) and to view as the minimal element of V T .
We writeV T = V T \{{1}, . . . , { p}, } for the interior nodes. Each interior node A ∈V T has exactly two children A 1 and A 2 such that (A, A i ) ∈ E T for i = 1, 2. We then define an integer labelling n :V T → Z by n(A) = − log 2 d z (A 1 , A 2 ) . We will always view n as a function on all of V T with values in Z ∪ {±∞} by setting n( ) = −∞ and n({i}) = +∞ for i = 1, . . . , p. Note now that if A, B and C are three disjoint sets, then As a consequence, the map n is monotone increasing on V T . Furthermore, as in [15,Eq. A.15], there exist constants c, C depending only on p such that Given a configuration of points z = (z 1 , . . . , z p ) ∈ (R d+1 ) p , we now write t z = (T, n) for the corresponding data constructed as above. We furthermore define a function ρ : R + → R + by (Beware that ρ is an upper bound for the square root of the covariance between two points.) We then assume that the following bound holds: Assumption 2.1. With the notations as above, for any p ≥ 2 and any uniformly over all z ∈ (R d+1 ) p . (Recall that p is the root of the tree T .) Here and below, the length of the multiindex k should be interpreted in the parabolic sense, The additional condition that η takes values in L p for sufficiently high p is mainly technical and could probably be dropped with some additional effort. It will be used to boundR (d) ε in the proof of Proposition 4.16. The exponent (d + 2)/c is consistent with the condition on the correlation function in the sense that this is the lowest value of p for which L p loc ⊂ C −c .

Justification
We claim that the assumption on the noise is rather weak on the ground that many natural constructions yield stationary random processes that satisfy it. We provide details for the following example: Proposition 2.3. Let θ : R d+1 → R be smooth away from 0 and such that for Let μ be a Poisson point measure over R d+1 with intensity 1 and set η = μ θ, then η satisfies the above assumption.
Before proving this proposition, let us first establish a property of joint cumulants of integrals of deterministic functions with respect to a Poisson point measure.
Proof. It is sufficient to prove the result in case there exist disjoint Borel subsets A 1 , . . . , A k of R d+1 with finite Lebesgue measure such that for 1 ≤ i ≤ p, In this case, however, the result follows readily from the fact that the joint cumulant is p-linear, and that the joint cumulant of a collection of random variables which can be split into two mutually independent subcollections vanishes, see for example property (iii) in [19, p. 32].
Proof of Proposition 2.3. It follows from Lemma 2.4 that so it remains to obtain a bound on this integral. We now consider z 1 , . . . , z p to be fixed and we shall make use of the labelled tree (T, n) built from these points as above. We are first going to treat the simpler case with all k i vanishing and then show how the argument can be modified to deal with the general case.

Case 1.
The case with all k i = 0. For every edge e = (e,ē) ∈ E T and every n ∈ Z with n(e) < n < n(ē), we define the domain It is possible to convince oneself that, provided that the constant c appearing in the definition of D (e,n) is sufficiently large, one has e∈E T n(e)<n<n(ē) As a consequence, the integral appearing in (2.3) is bounded by some constant times e∈E T n(e)<n<n(ē) We first use the fact thatρ is decreasing to conclude that, for n(e) < n < n(ē), one has the bound This can be seen as follows. Write {v 1 , . . . , v k } for the (possibly empty) set of nodes inV T lying on the shortest path joiningē to (not includingē and themselves). We then have, for every j = 1, . . . , k, since the number of factors appearing in each term is the same. Similarly, we have It remains to observe that n∈Z 2 −(d+2)nρ (2 −n ) = n∈Z 2 (c−d−2)n ∧2 (2c−d−2)n < ∞, so that (2.4) is indeed bounded by the required expression.

Case 2.
Note that we actually showed that the expression (2.3) with θ replaced bȳ ρ is bounded by the right hand side of (2.1) with k i = 0. To obtain the general case, it therefore suffices to show that We note that for z in the support of χ 0 , for 1 ≤ i ≤ p, 2 z i − z ≥ 2 −n(i ↑ ) , thus yielding (2.6) as required.
To bound the final term, we note that its integrand can be written as a finite sum of terms of the form where k, k j,i ∈ Z d+1 + and k + i = j k j,i = k j . Each of these terms is bounded above by the indicator function of the support of χ j times Since for z in the support of χ j and i = j, 2 z − z j ≤ z i − z j , so that . Combining all of these bounds, we finally obtain at which point we apply again (2.6) to obtain the required bound.

Law of Large Numbers
The aim of this section is to use a simplified variant of the arguments in [14] to show that Theorem 1.1 holds. 2 Although it would probably not be much more involved to obtain this proof by usual techniques, we give a proof using regularity structures. The main reason is that this allows us to introduce in a simpler setting a number of notions and notations that will be useful in the proof of our main result later on.
Before we turn to the proof proper, let us comment on the way in which we deal with the Neumann boundary conditions. Writing P Neu for the Neumann heat kernel and using the notation z = (t, x) (and similarly for z ), we rewrite (1.1) as an integral equation . We also fix an arbitrary time horizon T ≤ 1 which is not a restriction since the argument can be iterated.
Following [9], we then construct two functions K on R d+1 and K ∂ on R d+1 × R d+1 such that K is compactly supported, K ∂ is supported on a strip of finite width around the diagonal, and the identity See "Appendix A" for more details on the construction of these kernels and a proof that this can be done in a way that is compatible with the results of [9,11] that we will use in our argument.
Remark 3.1. We make no claim on the values of K and K ∂ for arguments outside of [0, 1] × D. This is because these will always be integrated against functions that are supported on [0, 1] × D and only the values of the result inside the domain will matter.
We choose K in such a way that it coincides with the whole space heat kernel P on the (parabolic) ball of radius 1 and is compactly supported in the ball of radius 2. We furthermore choose K in such a way that it annihilates polynomials of degree up to 3, is invariant under the transformation (t, x) → (t, −x), and is such that the sum of its reflections agrees with the Neumann heat kernel on [0, 1] × D. (See "Appendix A" for more details.) For example, we can choose K as in [14]. The kernel K ∂ is a correction term that encodes the effect of the boundary condition. Regarding our regularity structure, we then proceed as if there was no boundary condition whatsoever: we construct models defined on the whole space that are translation invariant and we use convolution with K as our integration operator. We then define an operator P Neu on modelled distributions by setting and K is built in exactly the same way as in [11,Sec. 4]. Note thatK ∂ encodes the effect of the boundary condition. (There is a completely analogous definition in the case of Dirichlet boundary conditions.) Here, L γ : C γ → D γ denotes the "Taylor lift" given by where z = (t, x) and k denotes a multiindex in N 1+d . We now have the preliminaries in place to turn to the proof of Theorem 1.1.
Proof of Theorem 1.1. We use a strategy similar to that in [14,15], combining this with results from [9] to deal with the boundary conditions. We refer to [7,8,12] for introductions to the theory of regularity structures, as well as to [11] for details. In our present context, we use the regularity structure obtained by extending the usual polynomial structure with parabolic scaling with a symbol 1 of degree −1 − κ representing the driving noise η ε , as well as an abstract integration operator I of order 2 representing convolution with the (singular part of) heat kernel. As usual, we will often use graphical representations for the basis vectors in our regularity structure(s), and we decree that is our symbolic representation for 1 . (The reason for introducing the "accent" representing the index "1" will become clear later on where more general notations of this type are needed.) Although our goal is to consider (1.1) on the bounded domain D ⊂ R d , we construct the models for our regularity structure on the whole of R × R d .
With notations almost identical to those in [14] and the formula (3.19) there, it would then be natural to consider a fixed point problem of the type where 1 D + denotes the indicator function of the space-time domain R + × D. Leaving considerations regarding the precise spaces of modelled distributions in which this equation makes sense aside for the moment, it is straightforward to see as in [11] that if we solve (3.4) for the renormalised lift of η ε , that is the admissible model such thatˆ ε = η ε ,ˆ ε = η ε (K η ε ) − , (3.5) then the function u ε = R ε U actually solves (1.1). Indeed, iterating (3.4), we see that any solution U to such a fixed point problem is necessarily of the form for some continuous functions u and ∇u. (This is purely notational, ∇u is not the gradient of u, but can be interpreted as a kind of "renormalised gradient".) In particular, the factor multiplying P Neu 1 D + in the right hand side of (3.4) is given by where we projected onto terms of negative (or vanishing) degree. At this point, it then suffices to note that the application of the reconstruction operator to L yields The problem with the argument outlined above is that since deg < −1, the behaviour of the modelled distribution 1 D + L near ∂ D is such that the reconstruction operator is not a priori well-defined on it, see [9,Secs 4.1 & 4.2]. This is for precisely the same reason why the restriction of a generic distribution ζ ∈ C α to a "nice" domain D is only well-defined if α > −1. (For α ≤ −1 there are non-zero distributions with support contained in ∂ D.) Before we tackle this problem, recall the definition of the spaces D γ,η as in [11,Sec. 6] (the hyperplane P being given by the time slice t = 0) as well as the spaces D γ,w as in [9,Sec. 4] (in which case P 0 is again the time 0 slice while P 1 = R×∂ D). These two spaces are distinguished by the fact that η is a real exponent while w denotes a triple of exponents describing the singular behaviour near t = 0, ∂ D and the intersection of both regions respectively. It will be convenient to use the notation (α) 3 with α ∈ R as a shorthand for the triple (α, α, α).
The idea then is the following. First, we introduce a new symbol , also of degree −1−κ, but representing the function η ε 1 R×D instead of representing η ε and we add to our regularity structure the symbols X and . Write then V for the sector spanned by , , and X ,V for the sector spanned by , , and X , and ι : V →V for the linear map with ι = and similarly for the remaining basis vectors. We will furthermore only ever consider models with the property that for all τ ∈ V . Since ι commutes with the structure group and preserves degrees, it follows that F → ιF is continuous from D γ,w (V ) to D γ,w (V ) for all choices of exponents γ and w and, for γ > 0, the local reconstruction operatorR (which yields a distribution on R × (R d \∂ D) and is always well-defined) satisfies The reason for the introduction of these extra symbols is that we would like to interpret (3.4) as a fixed point problem in the space D 1+2κ,(2κ) 3 with values in V 0 (for small enough κ), where V 0 is spanned by 1, , and X . The problem now is that, for F ∈ D 1+2κ,(2κ) 3 , we have F ∈ D κ,(κ−1) 3 , but we lose some regularity at the boundary when multiplying by the indicator function of our domain (see [9]), so that we only have 1 D Since the boundary index is now below −1, it follows that the reconstruction operator of [9] is not well-defined on 1 D + F . By Theorem C.5, it is however perfectly well-defined on 1 + F = 1 + ι(F ) since only the temporal singularity index is below −1 (but above −2). Furthermore, one has the identity R(1 + F ) =R(1 D + F ) for test functions whose support does not intersect the boundary of D.
Recall now that, given a modelled distribution F and a distribution ζ agreeing withRF outside the boundary of D, [9, Lem 4.12] defines a modelled distribution K ζ F with improved regularity and such that RK ζ F = K ζ . Furthermore, the map (ζ, F) → K ζ F is Lipschitz continuous in the natural topologies. This allows us to define an operator P D : Our discussion suggests that, instead of (3.4), we should consider the fixed point problem which admits unique local solutions in D 1+2κ,(2κ) 3 (V ) by [9], which are continuous with respect to admissible models on the full regularity structure satisfying furthermore the consistency condition (3.7). Here, we need to choose κ small enough to guarantee that P Neu u 0 does indeed belong to the space C κ+2,(1−κ) 3 , which is possible thanks to our assumption that u 0 itself is Hölder continuous for some positive exponent.
Retracing the discussion given at the beginning of the proof, but now with the renormalised modelˆ ε such that, in addition to (3.5), one hasˆ ε ιτ = 1 R×Dˆ ε τ for τ ∈ V , we conclude that for ε > 0, solutions to (3.8) coincide with those of (1.1). We now refer to Theorem 4.8 which shows that the sequence of modelŝ ε converges, as ε → 0, to a limiting modelˆ such thatˆ =ˆ = 0, extended canonically to the whole regularity structure. It follows immediately that the solutionŪ to (3.4) with the modelˆ is such that u (0) = RŪ does indeed solve (1.5) as claimed.
Remark 3.2. Note that (3.7) does not force us to setˆ ε = 1 R×Dˆ ε = 1 R×D η ε , but we could have added a sufficiently regular distribution supported on ∂ D. This however would break the identity on [0, 1] × D and would therefore modify the boundary condition of the resulting solution.

Central Limit Theorem
We now turn to the proof of the main result of this article, Theorem 1.2. We will mainly focus on the case of Neumann boundary conditions in dimension d = 3, which is the most interesting (and technically most difficult) case. We set (4.1) With this notation, we then have in the case d ∈ {2, 3} where, setting we have the explicit expression for the remainder term Furthermore, due to the non-vanishing boundary condition of u (1) in the Neumann case, v ε is then endowed with the inhomogeneous boundary condition We now also incorporate the first part of the second line into the remainder, so that we can write where the remainder termR (d) ε is given bŷ Note that here we have H appearing in (4.3) rather than H η and that the two are related by (1.4).
√ ε, and we obtain in the same way the slightly simpler expression (The reason why the term containing does not appear in this expression is because this was generated by ∂ t u (1) which vanishes by definition in dimension one.) The exponent α appearing in this expression is of course arbitrary, but allowing to tune it will be convenient when expressing this as a fixed point problem.

Decomposition of the Solution
In order to show that v ε converges to a limit, it will be convenient to break it into a sum of three terms. The first term will be a straightforward approximation to the stochastic heat equation with noise strength G(u (0) ) and homogeneous boundary condition. The second term will converge to 0, but incorporates the diverging boundary condition, which is used to compensate a resonance appearing in its right hand side. The final term will be a remainder that is sufficiently regular to be dealt with by the techniques of [9]. For this, we write and, with the convention that G and its derivatives are always evaluated at u (0) , we set endowed with the boundary conditions on ∂ D as well as vanishing initial conditions. The reason why v (1) ε will actually converge to 0 despite the diverging boundary condition when d ≥ 2 is the following. Consider the function ε defined by Then, we will see in (4.24) that the behaviour of v (0) ε is locally very well described by that of where ε = K ξ ε as usual. This implies that the behaviour of the term ε η ε appearing in the right hand side of the equation for v (1) ε is well described locally by that of Where denotes multiplication by η ε as previously, while denotes multiplication by η ε 1 R×D . (This will be formalised later on.) We will see that, up to vanishingly small errors, ε ≈ ε − d 2 , while ε ≈ ε 1− d 2 cδ ∂ D for a suitable constant c (in fact a different constant for each face of ∂ D in general), so that the first term in (4.9) is cancelled up to small errors by the term −ε − d 2 (GG )(u (0) ) appearing in the equation for v (1) ε , while the second term in (4.9) is cancelled by the term −ε 1− d 2 cGG δ ∂ D created by the boundary condition. Note that this argument does not see much of a difference between the Neumann and Dirichlet cases. Indeed, if we want the right hand side of the equation for v (1) ε to converge to a limiting distribution for the latter, we also need to add a diverging (when d = 3) term proportional to δ ∂ D . The difference is that K Dir (z, z ) vanishes for z ∈ ∂ D, so that this term has no influence on v (1) ε in the Dirichlet case. The idea now is to proceed as follows: • In a first step, we describe in Section 4.2 a regularity structure that is sufficiently rich to allow us to give precise control on the behaviour of v (0) ε , v (1) ε and v ε . As already alluded to, this will in particular include symbols representing nonstationary space-time stochastic processes, but we will try to keep these to an absolute minimum. We also describe there the renormalisation procedure which allows us to construct a suitable (random) model.
• We then make precise the description (4.8) for v (0) ε by expressing it as a modelled distribution in this regularity structure, which in particular contains the two symbols and . The main conclusion in (4.24) will be that the presence of allows us to express v (0) ε as a modelled distribution with a good behaviour near ∂ D. (At this stage, we could of course also use one symbol only and define our model using the Neumann heat kernel only, but the decompositions (4.8) and (4.9) are convenient for the remainder of the argument and to be able to reuse existing results.) • In a second step, we show that if we set thenˆ ε converges to a limiting model as ε → 0, which furthermore has good restriction properties to D, uniformly in ε. Furthermore, sinceˆ ε is only singular near the boundary of D, we can describe v (0) ε by another modelled distribution with worse behaviour near the boundary, but which only uses "translation invariant symbols" in its description, see Lemma 4.10, which allows us to give a description of v (1) ε in terms of such symbols in Lemma 4.12.
• We set up a sufficiently large regularity structure so that we can formulate a fixed point problem for the remainderv ε and control its behaviour as ε → 0, see Propositions 4.16 and 4.18. Combining this with the convergence of the corresponding renormalised model which is performed in Theorem 4.8 but relies crucially on the next section, we are finally able to conclude.

Definition of the Ambient Regularity Structure
We start by defining a regularity structure that is sufficiently large to allow us to perform the steps mentioned above and in particular to formulate (4.2) as a fixed point problem for a modelled distribution V . This fixed point problem will be chosen precisely in such a way that if we take as our model the renormalised lift of the noise η ε , then the corresponding counterterms are precisely such that if V solves the fixed point problem, then RV solves (4.2). Let (Beware that j is simply a superscript in j i , not a power.) In the graphical notation analogous to [14], we will use "accents" to denote the upper and lower indices on = 0 0 = , so for example 2 = , 1 1 = , etc. We will also sometimes write j i instead of j i . The degree of these symbols is given by The main reason why we never consider these "noises" as having degree larger than δ−1 2 is that we want to view them as "noise types" and so the structure group should act trivially on them. A byproduct of this choice is that it allows us to deal with driving noises η that are themselves unbounded. We now build a regularity structure extending the one built in Section 3, in which (4.35) can be formalised, by applying the framework of [5,Sec. 5]. We take as our basic "noise types" the noises j i with 0 ≤ j ≤ i ≤ 2, as well as an additional noise type which will be used to represent the noise ξ ε , restricted to R × D. We also introduce two "edge types" I representing convolution by a suitable cutoff K of the standard heat kernel in the whole space and I I I representing the integral operator with kernel K Neu as in (3.1) (see "Appendix A" for a precise definition), both having degree 2. It will be convenient to also introduce edge types I i and I I I i with i = 1, . . . , d of degree 3 representing the integral operators with respectively. Finally, we introduce a "virtual" edge typesÎ of degree 2 which will allow us to produce a rule (in the technical sense of [5, Sec. 5]) generating the relevant trees containing non-translation invariant symbols I I I and , but without cluttering our regularity structure with unneeded symbols.
The rule used to generate the regularity structure is then given by R( Recall that, given a collection L of "edge types" (in our case, these are the "integration" types I, I I I, etc as well as the "noise" types j i and j i which are also interpreted as edges for the purpose of this discussion), a "rule" is a map R : L → P(P(L))\{∅}, where P(A) denotes the powerset of a set A,P(L) denotes the set of non-empty multisets with elements in L. 3 In other words, an element R(t) is a collection of "node types", with each node type being a collection of edge types, with repetitions allowed. In (4.12), we use the identification between a multiset and a formal monomial, that is I 2 denotes the multiset with one copy of and two copies of I.
The basis vectors for our regularity structure are rooted trees (V, E, ρ) with edges e ∈ E labelled by L and nodes v ∈ V labelled by N d , denoting polynomial factors, with the convention that 0 ∈ N. By convention, edges are oriented away from the root. Any node v ∈ V then has a "type" N (v) ∈P(L) given by the collection of the types of the outgoing edges adjacent to v. If v ∈ V \{ρ}, there is a unique edge coming into v and we write t(v) ∈ L for the type of the incoming edge. A tree is said to "conform" to a rule R if, whenever v ∈ V \{ρ}, one has Remark 4.1. Note thatÎ never appears inside any R(t), so that a tree conforming to the rule R is not allowed to have any edge of typeÎ. The only reason for its presence is to allow the root of a conforming tree to be of type N (ρ) ∈ R(Î), since otherwise the tree which will be used later on would not be conforming to our rule. Actually, the symbol , which was introduced in a completely ad hoc manner so far, will then be interpreted as = − We assign degrees to the components appearing here by (4.11) as well as It is straightforward to verify that this rule is subcritical and complete. We henceforth denote by (T , G) the (reduced in the terminology of [5, Sec. 6.4]) regularity structure generated by the rule R.

Description of the Models
Throughout this article, we will use the notation (possibly with additional decorations) for a continuous linear map : T → D (R d+1 ) such that there exists a (necessarily unique) admissible model ( , ) related to by [11,Sec. 8.3]. Here, we associate the kernel K to I andÎ, K i to I i , and K Neu , K Neu,i to I I I, I I I i respectively.
We henceforth make a slight abuse of terminology and call itself a model. The following notion of an "admissible" model is a slight strengthening of the usual one to our context which essentially states that our "square" symbols are the restrictions of the "round" symbols to R × D. [11,Def. 8.29] for the kernels listed above and furthermore, for any τ ∈ T and i, j such that Note furthermore that all basis vectorsτ ∈ T with degτ < −1 are of the formτ = j i τ (or similar with replaced by ) for some i, j and τ as in Definition 4.2. As a consequence, admissible models in our sense admit a canonical decomposition = + + − as in Assumption C.1 by setting + j i τ = j i τ . This is a crucial fact which allows us to use Theorem C.5.
We also define the following notion: We write T ⊂ T for the sector spanned by trees conforming to R and we call T the translation invariant sector. Given any admissible model for (T , G), we then write K, K i , K Neu , K Neu,i for the corresponding integration operators as defined in [11,Sec. 5]. We now provide a complete description of the renormalised modelsˆ ε we will use for this regularity structure. From now on, whenever we do not explicitly mention a model on (T , G), we assume that we talk about the specific (random) modelˆ ε , and not a general admissible model. In particular, all the variants of the spaces D γ used below will be the spaces associated to this model. In order to specify the ε-dependence, we will write R ε for the reconstruction operator associated to the modelˆ ε . Theorem 4.8 shows thatˆ ε converges to some limiting model as ε → 0, giving rise to a reconstruction operator R 0 . Whenever we simply write R, it denotes the reconstruction operator for a generic admissible model.
Writing 1 D for the indicator function of R × D, we first define a family of non-renormalised (random) models ε as the canonical lift for the "noise" given by as well as˜ ε I i ( ) =˜ ε ( − ). We furthermore impose admissibility which forces us to set˜ ε X τ = ε X ·˜ ε τ . One can verify that the only remaining basis vector of T of negative degree and not belonging to the translation invariant sector is . However, this element will never be needed for our considerations, so we do not need to specify the action of˜ ε on it. We now construct a renormalised modelˆ ε from˜ ε by applying a slight modification of the BPHZ renormalisation procedure [5,6] to the translation invariant sector (which can be viewed as a regularity structure in its own right, generated by the rule R ). Writing T − for the subspace of T consisting of symbols of strictly negative degree, we will defineˆ ε by an expression of the form where − is a certain linear operator from T into AlgT − ⊗ T , with AlgT − the free unital algebra generated by T − , and g ε is a character of AlgT − , which is canonically identified with an element of the dual space T * − . In the BPHZ renormalisation procedure, one should choose g ε of the form where A : AlgT − → AlgT is the "twisted antipode" [5] and E˜ ε is the character of AlgT determined by (E˜ ε )(τ ) = E(˜ ε τ )(φ), for any fixed test function φ with z k φ(z) dz = δ k,0 for |k| small enough. (In our case |k| ≤ 1 suffices.) Recall that − is an "extraction/contraction" operator which iterates over all possible ways of extracting divergent subsymbols of its argument, so for example The twisted antipode behaves in a somewhat similar fashion, in this case Remark 4.5. Inspection of the rule R and our degree assignment shows that the basis vectors of T − are given by (4.15) It will be convenient to have an alternative degree assignment deg on T which better reflects the self-similarity properties of our objects given by setting and then extending it as usual. Instead of choosing the character g ε as in the BPHZ specification (4.14), it will be convenient to choose it in a way such that, for some constants C(τ ) that are independent of ε, one has For degτ = 0, we choose g ε (τ ) = −C(τ ) whenever the symbol τ contains an "accent", that is one of the noises j i with i + j > 0. The only symbols of negative degree without accents that appear in the translation invariant sector of our regularity structure and that contain at least two copies of the noise are and , which are of vanishing degree deg in dimensions 2 and 3 respectively, and will be considered separately below.
The constants C(τ ) themselves are chosen to coincide with the ones appearing in Theorems 1.1 and 1.2 with the convention that for any symbol τ the constant C(τ ) is also written as the same symbol τ , but drawn in red and with its "accents" stripped. For example, we set We also set C(τ ) = 0 whenever τ contains only one instance of the noise, namely we set = = 0. (4.16) [In general, we should also set C(τ ) = 0 if τ is of the form τ = I(τ ) for some τ , but the only symbol of negative degree of this type appearing in this work is which is already covered by (4.16).] Other constants that will be relevant for our analysis (in dimension d = 3) are given by = 0 as well as The convergence of integrals corresponding to , and in dimension 3 can easily be verified by using our assumption on the cumulants and the selfsimilarity of the heat kernel. The convergence of the integral for is more subtle since deg = 0. As a consequence, although κ 2 (z, z ) decays fast enough when z − z is large, the function z → x P 2 (z) is homogeneous of (parabolic) degree −5 and is therefore not absolutely integrable at large scales. However, since it is odd under (t, x) → (t, −x), additional cancellations occur and the integral should be interpreted as which does converge absolutely, so we set g ε ( ) = − in dimension 3. In dimension 2, deg > 0, but deg = 0 and the expectation of˜ ε diverges logarithmically and the expression given above fails to converge. We then have no choice but to set We then have the following preliminary result: Proposition 4.6. The modelˆ ε restricted to the translation invariant sector converges as ε → 0 to the BPHZ modelˆ such thatˆ = ξ andˆ with g bphz ε defined as in (4.14). Thanks to Proposition 5.5, our noises are such that the norm of [6, Def. A.18] is finite, uniformly in ε, for the cumulant homogeneity described in (5.3).
It therefore follows from [6, Thm 2.33] that bphz ε converges toˆ . Since furthermore the action of the character group of T − on the space of admissible models is continuous [5], it suffices to show that one can writê for some δg ε ∈ T * − with lim ε→0 δg ε = 0. For this, the following result is useful: Let g,ḡ ∈ T * − such that furthermore g(τ ) =ḡ(τ ) = 0 for every τ of the type (4.16). Then, one has the identity Proof. It was shown in [5,11] that where − : AlgT − → AlgT − ⊗AlgT − is an extraction/contraction operator defined just like above, but extended multiplicatively to AlgT − and such that only those terms are kept that actually belong to AlgT − ⊗ AlgT − (that is every factor needs to be of negative degree on both sides of the tensor product). Inspection of the list (4.15) reveals that in our case, the only situation in which we have a "subsymbol" of negative degree appearing in any of our symbol in such a way that the contracted symbol is still of negative degree is when the subsymbol contains only one noise. We conclude that and the claim follows.
We conclude from Lemma 4.7 that (4.18) holds with δg ε = g ε − g bphz ε , so that it remains to show that lim ε→0 δg ε (τ ) = 0 for every τ ∈ T − . For elements τ of the form τ = X k j i we have g ε (τ ) = g bphz ε (τ ) = 0. For all other elements τ with degτ ≤ 0, a simple scaling argument shows that g bphz ε (τ ) is given by the exact same formula as g ε (τ ), except that all instances of the heat kernel P are replaced by K ε , where Note that K ε coincides with P in a parabolic ball of radius O(1/ε) around the origin and vanishes outside of another ball of radius O(1/ε).
This in particular shows that which of courses converges to 0. The symbols τ differing from only by the placement of their accents then also converge since δg ε (τ ) is given by the same expression as (4.19), except for being multiplied by a higher power of ε.

Turning now to
(which only appears when d ∈ {2, 3}), it follows from [15, Lem. 6.8] that δg ε ( ) is a sum of terms of the form . It follows that |δg ε ( )| ε 2δ as desired, and δg ε ( ) is controlled in the same way by a higher power. In dimension 3, deg = 0 and it was shown in (4.17) that converges absolutely, which immediately implies that δg ε ( ) → 0.
To deal with the symbol (again with d ∈ {2, 3}), we first note that Assumption 2.1 implies the bound We also note that for any κ ∈ [0, d] one has the bound (4.20) This allows us to make use of [13,Thm 4.3]. Since deg = 2−d 2 we apply the bound (4.20) with κ = δ + d−2 2 which, in the notation of [13], yields a bound of the type for some ε-dependent kernel assignment (K ,R) ∈ K − 0 × K + 0 with bounds that are independent of ε and the Feynman diagram Here, the first coordinate of the label for each edge denotes its small-scale degree assignment while the second coordinate denotes its large-scale degree assignment. It is straightforward to verify that the small-scale degree assignment for this diagram satisfies the assumption of [13,Prop. 2.3], so that it does not require renormalisation. Furthermore, the large-scale degree assignment is seen to satisfy the assumption of [13,Thm 4.3], which guarantees that the integral converges absolutely and is bounded independently of ε, so that |δg ε ( )| ≤ ε κ . The symbols and (in dimension 3) which have vanishing degree deg, can be dealt with using the same technique, leading to the bound |δg ε ( )| + |δg ε ( )| ε 1/2 by using the Feynman diagrams , for bounding and for bounding . All three are easily seen to satisfy both the small-scale and largescale integrability conditions.
The remaining three symbols τ ∈ { , , } in the list (4.15) are all such that degτ > 0, so we need to show that g bphz ε (τ ) → 0. This can in principle be shown again by using the bounds from [13]. A "cheaper" way of showing that g bphz ε (τ ) → 0 is to note that in all three cases we can make use of a combination of Proposition 5.5 (used in the same way as in the proof of Proposition 5.9) and (4.14) to conclude that one can build a regularity structureT extending T (by adding additional "noises" representing η (α) ε for suitable choices of α) such that, for every τ , one can find κ > 0 and a symbolτ ∈T such that degτ > 0 and such that for some suitable fixed test function φ. Since we know from [6, Thm 2.33] that the BPHZ renormalised model bphz ε converges (so in particular remains uniformly bounded), we conclude that g bphz ε (τ ) → 0 as required. Proof. Convergence on the translation invariant sector was already shown in Proposition 4.6, so it only remains to consider the non-translation invariant symbols of negative degree. In dimension 3, these are , , , , , , , and . (There is also the symbol I i ( ), but applying the model to it yields the exact same distribution as when applying it to − .) The convergence on the remainder of the regularity structure is shown in the next section, but we collect the various parts of the proof here. Convergence of ε ,ˆ ε andˆ ε to ξ 1 D and 0 in C − 5 2 −κ , C −1−κ and C − 3 2 −κ respectively follows from Corollary 5.13 and Corollary 5.6.
Convergence ofˆ ε essentially follows from [6, Thm 2.31], noting that the bound for τ = does not require any derivative of the test function in this case, so that we immediately obtain the required bound by noting that ˆ ε (φ) = ˆ ε (1 R×D φ). The reason why this is so is that the only point in the proof where derivatives of the test function could potentially appear is in the bound [6,Eq. A.29] in the proof of [6,Thm A.32]. By the definition of R(S), these derivatives can only hit a test function in a situation where τ contains a connected subtree containing its root and of degree less than − d+2 2 . This is not the case for . Convergence ofˆ ε ,ˆ ε andˆ ε also follows in the same way. Regarding , we note that = + by (4.23) and we already obtained convergence of ε . Convergence ofˆ ε is the content of Theorem 5.20. The convergence of ε follows from Corollary A.3, combined with the usual Schauder estimates for integration against K .
Most of Section 5 is devoted to filling in the missing parts in the proof of Theorem 4.8, namely the proofs of Theorem 5.11 and Theorem 5.20.

Description of v (0) ε
We now have the notation in place to formalise the discussion given above regarding the local behaviour of v (0) ε and v (1) ε . Recall the definition (4.5) of v (0) ε which we rewrite in integral form as Depending on context, we will model v ε , we use Proposition 6.1, which guarantees that one can find ε ∈ C 3 2 −κ,− 1 2 −κ such that, setting 4 where the integration operators K Neu , and K Neu,i should be interpreted as the natural extensions of the corresponding integration operators described in Proposition 6.1, one has provided that we consider admissible modelsˆ ε withˆ ε = ξ ε 1 R×D . Note that V It is natural at this stage, as already mentioned earlier, to define the symbol as the element of T of degree − 1 2 − κ given by = − , (4.23) which is indeed consistent with (4.7). With this notation, Propositions A.2 and A.4 combined with (4.22) guarantee the existence of a function˜ ε ∈ C then we haveṼ As before, all these objects converge in the limit ε → 0 provided that the underlying models converge. To see this, we note that we can choosẽ and then apply Proposition A.4 to bound the first term and Proposition A.2 to bound the remaining two terms.
On the other hand, we defineV (0) ε by settinĝ where we use the convention that u (0) is extended outside of R + × D in any way that makes it globally C 3 . (For positive times, this is possible since the extension of u 0 to the whole space by suitable reflections is of class C 3 by our assumptions. For negative times, this is possible by Whitney's extension theorem [21].) Here we made a slight abuse of notation: the operator K should be interpreted in the sense of [9, Sec. 4.5] withR L 2 G(u (0) )1 + = G(u (0) )1 + ξ ε , which converges as ε → 0 by an argument very similar to that of Proposition 6.1, combined with the fact that G(u (0) ) is C 3 .
Note that we have The modelled distributionV (0) ε exhibits rather singular behaviour near the boundary of the domain, but by Proposition A.2 it converges as ε → 0 in D 1,w with w = − 1 2 −κ 3 [we use again the notation (η) 3 = (η, η, η)]. It has the advantage however of not involving the integration map I I I and the restricted noise , so it is purely described in terms of the "translation invariant" part of the regularity structure, that is the sector generated by I and . We summarise the above discussion with the following statement.

Lemma 4.10. We haveV
for some continuous function Recall that v (1) ε was defined in (4.6) as the solution to an inhomogeneous linear equation with inhomogeneous boundary conditions (et least in the Neumann case).
Regarding v (1) ε , we would like to describe it by a modelled distribution V (1) ε given by with P Neu as in (3.2) and where v (1, ) ε would be given by the solution to endowed with homogeneous Neumann boundary conditions. The reason why the identity R ε V ε holds (on R + × D as usual) is as follows. By (4.24), It then follows from the definition (4.13) of the renormalised model and the fact that ε that, applying the reconstruction operator to this expression yields We then note that the two terms appearing on the second line, when multiplied by another factor of G(u (0) ), are exactly the additional two terms appearing on the first line of (4.6), while the singular term involving Dirac masses on the boundary of D, when hit by K Neu , is responsible for the non-homogeneous boundary conditions. The problem with such a definition is that it yields a description of V (1) ε in terms of symbols involving , while the general convergence results of [6] require translation invariance of the noise objects, which is not the case here. If we were to try to improve the situation by replacing 1 + by 1 D + in (4.25), then we immediately run into the problem that the behaviour of this modelled distribution on the boundary of D is too singular for the general results of [9] to apply. This is not just a technicality: this singular behaviour is precisely what is responsible for the additional boundary renormalisation! Instead, we define V (1) ε as a sum of terms that is "equivalent" to the definition (4.25) in the sense that they reconstruct the same function, but such that each of the terms can be controlled in a slightly different, situation-specific, way.
We first deal with the boundary correction by setting (4.28) ) and since it belongs to a sector of regularity − 3 2 − 2κ, its reconstruction belongs to C − 3 2 −2κ . It then follows from Proposition A.2 that V (1,0) ε ∈ D 2,(0) 3 . We now break upṼ (0) ε in (4.25) as in (4.24) and deal with the first term. By Proposition 6.3, we can find V taking values in the classical Taylor polynomials, and such that The second term is dealt with similarly. As a consequence of Proposition 6.4 with g 1 = G G(u (0) ), g 2,i = G (u (0) ) 2 ∂ i u (1) and g 3 = G (u (0) )˜ ε (with˜ ε as in (4.25)), we can find V (1,2) ε ∈ D 2−2κ,w such that and such that furthermore V (1,2) ε takes values in the translation invariant sector and is of the form for some (1,2) ε taking values in the Taylor polynomials. In order to define V (1,3) ε , we make use of the following lemma: Lemma 4.11. Let φ ε be such that on R + × D one has the identity v (0) ε = G(u (0) )ˆ ε + φ ε 1, and one has φ ε (t, x) = 0 for t < 0 or x ∈ D. Then, for any α ∈ [0, 1), one has the bound E φ ε α+ 1 Proof. We decompose φ ε as and we treat the three terms separately. The first two terms are estimated by combining Proposition A.2 with Corollary 5.10. The bound on the last term follows from combining Proposition 5.9 with Corollary B.5. To apply the latter, we set θ = κ (small enough) and χ = α − 5 2 − κ, which yields a bound in C κ+2 on Since K i gains three derivatives, the term K i 1 D + ξ ε itself satisfies the required bound and we are done.
Recalling W 1,ε as defined in (4.26), it follows from Lemma 4.11 combined with the definition of the renormalised model that we can rewrite it as As a consequence of Proposition 5.9 (with α = κ), combined with Lemma 4.11 (with α = 1 2 ), we conclude that one has lim Indeed, the reconstruction theorem [9, Thm 4.9] and the multiplication rules [9,Lem. 4.3] imply that if g ∈ C γ,(η) 3 for η ≤ 0 and γ > 0, and ζ ∈ C β with β ≤ 0 and γ + β > 0 then, provided that η + β > −1, one has gζ ∈ C η+β . (View ζ as the constant function in a regularity structure containing only one symbol of degree β and apply the reconstruction theorem to gζ .) Since − 1 2 − 3κ > −1, we can multiply such a distribution by the indicator function of R + × D. It follows that, setting . Combining these definitions, we set Summarising this discussion, one has the following result: ε andV (1) ε is of the form
The only statement we haven't shown yet is that R 0V (1) = 0. Since we already know by (4.32) that W 1,ε converges to 0 and since R εV it remains by (4.25) to show that lim ε→0 R ε Ṽ (0) ε = 0. This in turn is immediate from (4.27) when combined with Theorem 4.8 which guarantees that the limiting model vanishes on all accented symbols.

Formulation of the Fixed Point Problem
Introduce now a modelled distribution V ε and, using the shorthandV ε =V  We claim that with this definition and provided that we consider the renormalised model constructed in Section 4.3, (4.35) admits a unique solution in D 3 2 −3κ,(0) 3 and, provided that we set ς = ε α ξ ε andς = ε −1−κ σ ε , one has v ε = R εVε . The reason for the appearance ofR (d) ε is to cancel out some additional unwanted terms arising from the renormalisation procedure. Before this, we formulate a technical lemma, where we write θ α for the C α norm of the function / distribution θ on D T = [0, T ] × D with T as in Theorem 1.2. Lemma 4.15. Let w,w with w L ∞ + w L ∞ ≤ ε −d/2 on the domain D T and let κ ∈ (0, 1 4 ). Writing X = ς − 1 2 +2κ + 1, one has the bounds for some proportionality constants depending only on u (0) , G and H . In dimensions 2 and 3, we set X = ς L p + 1 (for any fixed p ∈ [1, ∞]) and we have the bounds for κ sufficiently small. Proof. The case of dimensions 2 and 3 is straightforward to verify since all bounds are uniform. In dimension 1, the first term of (4.3) is easy to bound. To bound the second term, we use the fact that composition with a smooth function is a (locally) Lipschitz continuous operation in C 1 2 −κ , combined with the fact that the product is continuous as a bilinear map from C 1 2 −κ × C − 1 2 +2κ into C − 1 2 +2κ , see [3] or [8, Thm 13.16].  Proof. We first consider the case d = 3. Note first that, for any C 4 functionG, we have L(G(u (0) , u (1) )) ∈ D 2,(0) 3 . SinceV
Furthermore, the reconstruction operator of Theorem C.5 continuously maps the space D , which is then mapped continuously into C 2,(0) 3 by K ∂ by Proposition A.2, and therefore into D 2,(0) 3 by the Taylor lift L, again with arbitrarily small norm for small time intervals as a consequence of the bound (A.3) which also holds for K ∂ .
Note now that by Corollary 5.6, we have E ξ ε κ−2 ε − 1 2 −κ . As a consequence of (4.21), we conclude from this that E|v Since β > 1 2 , it follows from Lemma 4.15 that, for p = (d + 2)/c, uniformly over bounded sets for the modelˆ ε and over bounded sets for V ε +V Since u (0) and u (1) are bounded in C 1 , it is immediate from (4.36) that one has a bound of the type In particular, the argument of P Neu appearing in the last term on the right hand side of (4.35) is mapped continuously by P Neu into C 3 2 −3κ,(0) 3 , again with arbitrarily small norm when considering a short enough time interval. Furthermore, all of these expressions are locally Lipschitz continuous (with similar bounds) as a function of V ε in D 3 2 −3κ,(0) 3 and of the modelˆ ε , uniformly over ε ∈ [0, 1] which yields the first claim over a short enough time (but bounded from below independently of ε) interval. This can be maximally extended as usual, and the claim follows from the fact that we know a priori that solutions to (4.38) do not explode.
The second claim is straightforward by simply setting ε = 0 and applying the reconstruction operator to both sides of (4.35). The case of d = 2 is virtually identical, noting in particular that even though ε diverges in this case, it only does so logarithmically and is therefore compensated by the factor ε κ in (4.36). We leave the verification of the case d = 1 to the reader. Proposition 4.18. Let ς ε = ε α ξ ε ,ς ε = ε 2−d−κ σ ε , and defineˆ ε as in Section 4.3. Then, the assumptions of Proposition 4.16 are satisfied and we have ς ε ,ς ε → 0 in their respective spaces. Furthermore, for any ε > 0, the modelled distribution V ε constructed in Proposition 4.16 is such that R εVε coincides with the process v ε defined in (4.1).
Proof. We first show that the assumptions of Proposition 4.16 are satisfied. The fact that the random modelsˆ ε are uniformly bounded (in probability) as ε → 0 and converge in probability toˆ is the content of Theorem 4.8. By the second part of Assumption 2.1 combined with stationarity, we furthermore see that which converges to 0 in probability by Corollary 5.6. We also conclude from Corollary 5.6 and our definitions that, It remains to show that solutions coincide with v ε . This is a special case of the general result obtained in [2] and could in principle also be obtained in a way similar to [14]. We present a short derivation here in order to remain reasonably self-contained.
The powercounting of the various symbols appearing in our structure depends on the dimension, so we first restrict ourselves to the case d = 3, which is the one with the largest number of terms of negative degree appearing. Combining (4.35) with Lemmas 4.10 and 4.12, we conclude that if we take for V ε any solution to (4.35), there exist functions v and ∇v such that, for (0) ε and (1) ε as in Lemma 4.12, the following identities hold on R + × D: Developing the argument of P Neu 1 D + in (4.35) up to order 0, we conclude that it is given by (4.39) At this point, we apply the results of [2]. Comparing [2,Eq. 2.20] with [2,Def. 3.20] and [2,Thm. 3.25], we see that each term appearing on the right hand side generates a counterterm for the renormalised equation. Each of these terms is of the formF(v, ∇v, u (0) , ∇u (0) , u (1) , ∇u (1) )τ for some functionF and some symbol τ . The counterterm generated by any such term is then obtained precisely by simply replacing τ by the corresponding renormalisation constant and by interpreting the first two arguments ofF as the value and gradient of the actual solution (after reconstruction).
Remark 4.19. One may worry that we are not quite in the framework of [2] because of the special treatment ofV (0) ε andV (1) ε . This however is due to purely analytical reasons that only affect the boundary behaviour. The computation of the renormalisation terms on the other hand is a purely algebraic affair which is not affected by this. The boundary conditions of v ε however are affected by our decomposition and need to be determined separately.
It follows that, in dimension 3, the solution v ε = R εVε to the fixed point problem with the renormalised model satisfies in R + × D the PDE Furthermore, both R εVε and v ε have homogeneous boundary conditions, so that the boundary conditions of v ε coincide with those of v (1) By (4.36) and since we choseς = ε −1−κ σ ε , there is a cancellation betweeñ R ε (ς) and some of the other terms appearing in this equation. Since furthermore = 0, we obtain which, when combining with the definition of given in (1.6) and the fact that ς = ε α ξ ε , precisely coincides with (4.2). Since its initial condition and boundary condition coincide as well, this completes the proof of the claim. In dimension 2, a similar argument (but taking less terms into account) yields Again, the termR (2) ε precisely cancels the term proportional to ε σ ε , so that this again coincides with (4.2). In dimension 1, an even simpler argument shows that which again coincides with (4.2), noting that in this case one has u (1) = 0.

Convergence of Models
In order to show convergence of the models, we apply the general result of [6,Thm 2.31]. This result shows that if one considers the "BPHZ lifts" of a sequence of smooth and stationary stochastic processes ξ n as given in [5,Thm 6.17] then, provided that one has uniform bounds of a suitable "norm" of ξ n and under a few relatively weak additional algebraic assumptions, the resulting sequence of models converges to a limit, provided that ξ n → ξ weakly in probability.

Cumulant Homogeneity Assignments
In this section, we define and we often use z = (t, x) for space-time coordinates. The exponents α will always be chosen in c, d+2 2 . Our aim is to obtain a suitable bound independent of ε for joint cumulants of the form Given a finite collection of at least two space-time points z = {z a } a∈A , we again consider the corresponding labelled binary tree t z = (T, n) as in Section 2.1, with the leaves of T identified with the index set A. Recall that the nodes V T of T are given by subsets of A, with inner nodesV T given by subsets with at least two elements and the root of T given by A itself. Recall from [6, Def. A.14] the following definition: • If |A| ≥ 3, then for every u ∈V T with |u| ≤ 3, one has v∈V T :v⊂u c (α)

Remark 5.2. Applying the first two conditions with B = A and u = A respectively implies in particular that v∈V T c (α)
We will now display a consistent cumulant homogeneity such that, for every finite set A, space-time points {z a } a∈A , and choices α : A → [c, d+2 2 ], We have the bound We claim that one possible choice is obtained by setting Proof. The first two conditions of Definition 5.1 follow immediately from the structure of the formula (5.3), in particular the facts that 2 −d(u,a) is positive, vanishes for a ∈ u, and is such that u∈V T 2 −d(u,a) = 1. Regarding the last condition, the case |u| = 3 follows from the fact that the second condition holds and α a ≤ d+2 2 . The case |u| = 2 follows from the condition |A| ≥ 3 which guarantees that the corresponding sum is bounded by d+2 2 since d(u, a) = 1 in this case.    Assume now that |A| > 2 and fix a binary tree T over A. Write A 1 and A 2 for the children of A inV T , so that A = A 1 A 2 . We distinguish two cases. In the first case, |A 1 | ∧ |A 2 | ≥ 2, so that the tree T can naturally be thought of as two trees T 1 , T 2 over A 1 , A 2 joined by their roots. By (5.4) (in particular the fact that there is an additional −1 at the root), we then have for all other u ∈V T , with i ∈ {1, 2} depending on whether u ⊂ A 1 or u ⊂ A 2 . We conclude by using the induction hypothesis, which implies that c (α) In the second case, we have |A 1 | = 1 and |A 2 | ≥ 2 (or vice-versa), the case |A| = 2 having already been dealt with. In this case, the tree T consists of a subtree T 2 over A 2 as before, with an additional root vertex A and single extra leaf. In this case, we have whence we conclude as before. Proof. It follows from Assumption 2.1 and (5.1) that

(5.7)
We now note that (5.6) implies that for everyĉ ∈ [c, c], uniformly over n ∈ Z and ε ∈ (0, 1]. Inserting this into (5.7) immediately yields that, uniformly in ε, one has Since the map c (α) T is of the desired type by Lemma 5.4 [modulo the additional factor 2 at the root which is taken care of explicitly in (5.8)], the claim follows. Corollary 5.6. For any α ∈ c, d+2 2 , one has η (α) ε → 0 in probability in C β for every β < −α. In particular, η ε → 0 in probability in C β for every β < −1 and ζ ε → 0 in probability in C β for every β < d−2 2 . The same holds for η (α) ε 1 A for any fixed Borel set A.
Proof. The first statement is an immediate consequence of [6,Thm 2.31]. The fact that we can multiply η (α) ε by an arbitrary indicator function follows from the fact that these bounds do not involve the derivative of the test function in this case. Remark 5.8. Since our scaling and degree assignment are fixed throughout and since we consider all cumulants, that is we choose L Cum to contain all possible cumulants, we omit these from our notation.

Power-Counting Conditions
Proof. We reduce ourselves to considering symbols of negative degree since the claim then follows automatically for the remaining ones. These symbols are listed in (4.15). Note also that the definition of c-super regularity is non-trivial only for trees that contain at least three noises, so that it suffices to consider the symbols By (5.9), it is sufficient to verify that for every subtree τ containing k ≥ 2 instances of a noise one has the bound where i denotes the ith noise appearing in τ . This can be seen simply by inspection of the list (5.10).

Special Bounds
For some of the symbols in our regularity structure, we will bounds that are stronger than what is suggested by the degrees of the symbols in question. In this statement, φ denotes an arbitrary measurable function with sup z |φ(z)| ≤ 1, suppφ ⊂ B(0, 1). (5.11) Proposition 5.9. Let δ > 0 be as in Section 2. For α ∈ [−δ, 1] and β = − 1 2 − α, we have the bounds where we write X Y as a shorthand for the existence, for every p ≥ 1, of a constant C such that E|X | p ≤ CY p , uniformly over all λ ≤ 1, ε ≤ 1 and φ as in (5.11).
In particular, we can write and we can apply [6, Thm 2.31], showing that the BPHZ renormalisation of this term satisfies the required bounds. Recall thatˆ ε doesn't quite agree with the BPHZ renormalisation, but the error between the two is given by 2ε 3/2 δg ε ( ) ε with δg ε as in (4.19), which is easily seen to satisfy the required bounds.
The other two terms can be dealt with similarly by writing thus concluding the proof. Corollary 5.10. For every κ > 0, every α ∈ [0, 1), and every Borel set A ⊂ R d+1 , one has the bound E 1 A ξ ε Proof. This follows from the third bound of Proposition 5.9 by Kolmogorov's criterion, using the fact thatˆ ε = ε 3 ξ ε and that we can move the multiplication by 1 A onto the test function.

Tightness and Convergence for the Noise
In this section, we show that the convergence announced in Theorem 4.8 holds. As usual, convergence is obtained by first showing tightness and then identification of the limiting distribution. More precisely, we prove the following: Theorem 5.11. One has ξ ε → ξ weakly in C α for every α < − d+2 2 .
It turns out that this statement is a relatively straightforward consequence of the following proposition. Writing κ p (X ) for the pth cumulant of a real-valued random variable X , one has the following: Proposition 5.12. For p ≥ 2, we have the bound uniformly over all λ ∈ R + and all φ as in (5.11).
Before we turn to the proof of Proposition 5.12, let us show how to deduce Theorem 5.11 from it. First, we have the following corollary: Corollary 5.13. For any κ ≤ 1 2 and p ≥ 2, one has the bounds uniformly over λ ≤ 1, ε ≤ 1 and φ as in (5.11).
Proof. By simple rescaling, it is straightforward to see that the required bound is equivalent to the bound uniformly overλ ≤ 1/ε and ε ≤ 1. For even integers p ≥ 2, Proposition 5.12 implies that |κ p η, φ λ It remains to observe that (5.13) implies that E| η, φλ 0 | p λ −cp , uniformly over allλ > 0, for any c ∈ [c, d+2 2 ]. Since (5.12) is of this form and our assumptions guarantee that the values of c appearing there fall into the correct interval, this concludes the proof.
Proof of Theorem 5.11. Using [11, Eq. 10.4] (which is nothing but an analogue of Kolmogorov's continuity criterion) and the compactness of the embeddings C α ⊂ C β for α > β (over any bounded domain), it follows from Corollary 5.13 that the laws ξ ε are tight in C α for every α < − d+2 2 . It remains to show that every limit ξ of a convergent subsequence of ξ ε is spacetime white noise. It thus suffices to show that, for every φ ∈ C ∞ 0 with φ 2 (z) dz = 1, ξ(φ) is centred normal with variance 1. Since all moments of ξ ε , φ remain bounded as ε → 0 by Corollary 5.13, one has For p ≥ 3, this vanishes, thus showing that ξ(φ) is Gaussian. It clearly has zero mean since this is already the case for ξ ε , φ . Its variance is given by Note now that, for every ε > 0, every δ ∈ [0, 1] and any z,z ∈ R d+1 , one has the bound Recalling that φ 2 = 1 and that κ 2 (0, z) dz = 1 by (1.3), we thus obtain Here, we used the fact that the integral of φ 1/ε 0 is independent of ε, as well as our assumptions (1.3) and (2.1) on the covariance function of η. In particular, we choose δ as in the definition of c, see Section 2, whence 2c − δ > d + 2 which guarantees integrability at infinity. (Integrability at 0 is guaranteed by 2c − δ = d − δ < d + 2.) Proof of Proposition 5.12. We expand the expression for the cumulant as In order to bound this integral, we use a simplified version of the type of multiscale analysis used in [15,16]. Let us recall how this works. We now write t = (T, n) for a generic binary tree, together with a scale assignment as above (that is we enforce the fact that n is monotone) and T for the set of all such t. Given t ∈ T, we write V t /E t for the vertex/edge set of the corresponding tree and n t for the scale assignment. In particular, n t ( p ) denotes the scale assignment for the root, which controls the diameter of the set {z 1 , . . . , z p } in R d+1 . The number p will always be considered fixed, so we do not include it explicitly in our notation. Denoting by T the map T : (z 1 , . . . , z p ) → t ∈ T defined at the start of Section 2, we set D t = T −1 (t) for the set of all configurations of points z ∈ (R d+1 ) p giving rise to a given combinatorial data. Then, it was shown in [15,Lem. A.13] that, for every bounded Borel set U ∈ R d+1 one has the bound where | · | denotes Lebesgue measure. Furthermore, by construction, t∈T D t is of full Lebesgue measure.
Without loss of generality, we can restrict ourselves to the case where the support of φ has diameter bounded by 1 in the parabolic distance. With all of this notation at hand, we then bound (5.14) by We simplify this expression as follows. First, we note that as soon as 2 −n( p ) ≥ λ, since the support of φ λ 0 is bounded by λ, so that we can restrict the sum above to those t satisfying 2 −n( p ) ≤ λ. Furthermore, one has Combining this with Assumption 2.1, we conclude that

(c(A)−(d+2))n t (A) .
We treat separately the cases λ ≤ 1 and λ ≥ 1. In the former case, we can apply [15,Lem. A.10] with the distinguished vertex ν appearing there equal to the root p . The first condition appearing there is then satisfied by the fact that 2c − (d + 2) < 0 by assumption, while the second condition is empty and therefore trivially satisfied. Note that the shape T of the tree is fixed in [15,Lem. A.10], while we also sum over all possible shapes, but since there are finitely many of them for any fixed p, this just yields an additional prefactor. We thus obtain the bound SinceV t contains exactly p − 1 elements (the tree T is binary and has p leaves), we finally obtain as required.
The case λ > 1 is split into two subcases. If p ≥ 3, we use the fact that |κ p (z)| κ p,c (z) uniformly over all z, so that, using [15, Lem. A.10] as above, we have Note that, in order to be able to apply that result, we need to verify that, for every subtreet of t spanned by some subset of its leaves satisfies A∈Vˆt α(A) < 0, where otherwise. This is of course trivially satisfied as soon ast = t since c < d + 2 by assumption. The exponent at the root however is given by 2c−(d +2), which is positive, but since p ≥ 3, the sum of all exponents is given by 2c−(d +2) which is indeed negative for p ≥ 3 and δ < d+2 6 ,which we assume w.l.o.g.. It remains to consider the case p = 2 and λ > 1. In this case, the above computation reduces to as required, thus concluding the proof.
We also use the following convergence results which do not strictly speaking follow from Theorem 5.11 since multiplication by an indicator function (even that of a hypercube) is not a continuous operation on C β for β < 0: Lemma 5.14. Let ξ ε,D = ξ ε 1 R×D and ξ + ε,D = ξ ε 1 R + ×D . Then, ξ ε , ξ ε,D , and ξ + ε,D jointly weakly converge to limits ξ , ξ D and ξ + D .
Proof. The proof is identical to that of Theorem 5.11, using the fact that all bounds we used are uniform over test functions as in (5.11), so that multiplying them by the indicator function of some domain changes nothing.

Boundary Term
Recall that we have set where ∂ i,0 D = {x ∈ D : x i = 0} and ∂ i,1 D = {x ∈ D : x i = 1} and the constants are given by where the function Q i, j is defined as follows. For i = 1, . . . , d, write ι i : R×R d → R d+1 for the map given by where x (i) ∈ R d−1 denotes the vector obtained from x by deleting the ith coordinate. With this notation, we then set Remark 5.15. We will show in Lemma 5.21 that both the integrands in the definition of Q i, j and the functions Q i, j themselves are integrable, so that these expressions are all finite.
Remark 5. 16. The formula given above is valid for the case of Neumann boundary conditions. In the case of Dirichlet boundary conditions, a similar formula holds, but the precise values of the constants do not matter since they do not affect the solutions.
Note that although P • ι i = P, it does not depend on i, while κ 2 • ι i does depend on i in general, since we do not assume that the driving noise is isotropic. We henceforth writeP = P • ι i . One of the main results of this section is that the renormalised model on vanishes in a suitable sense as ε → 0. We first provide a bound on its expectation, which requires the bulk of the work. For the formulation of this result, we write B 1 0 for the set of all test functions φ ∈ C 1 0 with support contained in the parabolic ball of radius 1 and such that max{ φ ∞ , Dφ ∞ } ≤ 1.
Before we turn to the proof, we introduce a number of notations and preliminary bounds. Write G for the reflection group generated by "elementary" reflections across the 2d planes containing the faces of D. The group G is naturally identified with Z d (as a set, not as a group!) since for each k ∈ Z d there exists exactly one element R k ∈ G mapping k + D into D. We also write G R → (−1) R ∈ {−1, 1} for the group homomorphism mapping the elementary reflections to −1. We will write : R d+1 → R × D for the map such that Proof. We can assume that ε < c, so that |κ (ε) by Assumption 2.1, and the bound follows at once.
Proof of Proposition 5.17. We now consider the Neumann case, the Dirichlet case follows from a virtually identical calculation. We start by bounding the expectation of ˆ ε (φ λ z ). By the reflection principle (5.18), the correction due to the Neumann boundary conditions is given by It then follows from the definitions that Writing λ = λ ψ for the diameter of suppψ, we aim to give a bound of the form for any κ > 0 small enough, uniformly over ε, λ ≤ 1. We restrict ourselves to the case λ ψ ≤ 1 8 and we set We furthermore assume for the moment that (5.21) and that, whenever W ψ ∩ ∂ D = ∅, one has W ψ ⊂ D. We will see later that it is always possible to reduce oneself to this case. Finally, for x ∈ R d , we write |x| 0 for the number of vanishing coordinates of x and we set d ψ = sup{|x| 0 : (t, x) ∈ W ψ }.
We treat the different cases separately. The case d ψ = 0. In this case W ψ ∩ ∂ D = ∅, so that suppψ ∩ ∂ D = ∅ and in particular ˆ ε (ψ) = ε (ψ). We then have We break the inner integral into a finite sum of integrals over R × (k + D), since K has compact support and z ∈ R × D. Since we can restrict z to the support of ψ, we have |x − x | ≥ λ for all x ∈ k + D with k = 0 by the definition of W ψ . We can then apply Lemma 5.18 with U = R × (k + D) and = for the first term, while = id for the second term. It follows that |E ˆ ε (ψ)| is bounded by which is indeed bounded by the right hand side of (5.20). The case d ψ = 1. In this case, the support of ψ is located near one of the faces of D (say ∂ i,0 D), but its distance to the other faces is at least λ. Write R for the element of G which corresponds to reflection around the plane containing ∂ i,0 D and π i : R d+1 → R d+1 for the orthogonal projection onto that plane. We also write We first note that where |R| is bounded by (5.23). Indeed, the integrands in (5.19) and (5.24) vanish on D and coincide on RD. Their integral over the complement of these two regions is then bounded exactly as before by applying Lemma 5.18 to finitely many translates of D. By Lemma 5.19, we can furthermore replace K by P so that where |R| is bounded by (5.23). The reason for this is that P − K is supported outside of an annulus of radius 1 and ψ is supported at a distance of at most 1/4 from the reflection plane of R, and one has z − Rz ≥ 1/2 for all z, z with z ∈ suppψ and z − z ≥ 1.
On the other hand, we claim that To see that this is the case, we first perform the change of variables z = ι i (s, v) and similarly for z , and we note that π i ι i (s, v) = ι i (0, v), so that the right hand side is given by Performing the substitution s → −β and comparing to the definition of Q i,0 , we conclude that Note now that the integral over u is restricted to the projection of the support of ψ which is of volume at most λ d+1 (since u consists of d − 1 spatial variables and one time variable). Bounding ψ by its supremum, it follows that On the other hand, we can bound Combining these and choosing ψ = φ λ z so that ψ ∞ λ −d−2 and D x ψ ∞ λ −d−3 yields a bound of the order We claim that this case can be obtained as a consequence of the bounds for the cases d ψ = 0 and d ψ = 1. We consider the case of dimension d = 2 (but the computation below is done in a way that keeps track of dimension and applies to d = 3 as well) so that by (5.21) We then fix a smooth function χ : R → R + such that suppχ ⊂ [0, 2] and such that k∈Z χ k (x) = 1, where χ k (x) = χ(x − k). For some fixed constant c > 1 and integers n, k i and , we then set By choosing c sufficiently large, we can guarantee that, for every n, k and , the function χ n,k, is such that d χ n,k, ∈ {0, 1}. Furthermore, there are only finitely many values of k (independently of n and ) for which χ n,k, = 0. This is because χ n,k, (S 2 n z) is independent of n. Fix now a test function of the form ψ = φ λ z 0 and write ψ n,k, (z) = φ λ z 0 (z)χ n,k, (z).
By construction, one has ψ n,k, = 0 for n such that 2 −n ≥ 2λ, so that Applying the bounds we already obtained for d ψ ∈ {0, 1}, we conclude that For any given n, the number of values for k and leading to non-vanishing λ ψ n,k, is of the order of (λ2 n ) d , so that we eventually obtain the bound as claimed. The case of dimension d = 3 is identical, the only difference being that now has two components. Note that this calculation breaks in d = 4 where the sum over n diverges. This suggests that in this case one would have to add toˆ ε an additional correction term that charges faces of codimension 2. The case d ψ = 3. This is relevant only for d = 3, we shall however keep track of d in our calculation to illustrate how this would behave in higher dimensions. We then proceed in the same way as for the case d ψ = 2, making use this time of the fact that we already have the required bound for all test functions with d ψ < 3. This time, we have and we set in a fashion similar to above, This time, for any given n, the number of values for k and leading to non-vanishing λ ψ n,k, is of the order of (λ2 n ) d−1 , which then yields similarly to above Note that this time the series actually converges for all d < 6.
To conclude, we justify the assumption (5.21) and the fact that, for W ψ ∩ ∂ D = ∅, one has W ψ ⊂ D. Indeed, by our assumption on λ ψ , it is always possible to enforce this by applying a finite number of reflections around the planes {x : x i = 1/2}. If suppψ intersects ∂ i,1 D for example, then we reflect around x i = 1 2 to have suppψ intersect ∂ i,0 D instead. The only effect of this reflection is that, in order to obtain the same answer, we only need to reflect the noise η around that plane. The effect of this operation on its covariance function is to change the sign of the ith spatial coordinate of its argument, which is why how we obtain Q i,1 rather than Q i,0 in (5.26).
We now have the main ingredients in place to prove the main result of this section.
Proof. Writing ψ = φ λ z , the triangle inequality yields and we already obtained the required bound on the second term in Proposition 5.17, so we focus on the first one. Furthermore,ˆ ε differs from ε by a deterministic quantity, so that we only need to bound By (4.7) combined with (5.18), this random variable equals This time, we will not need to exploit the cancellation between these two terms on R × D, so we simply bound both terms separately. The second term equals ˆ ε (ψ), which is bounded by the right hand side of (5.27) by Theorem 4.8. For the first term, we use the fact that K is compactly supported, so that it can be bounded by a finite sum of terms of the type The expectation of the pth power of this expression is given by a multiple integral with the integrand given by a sum of terms, each of which is a product of heat kernels and of cumulants. At this stage, we note that the bound in [6] does not exploit any further cancellations, so we can put absolute values everywhere, bound K (z − Rz ) by z − z −d , and use the bounds from that paper.

Auxiliary Results
In this section, we collect a number of results that are more or less straightforward consequences of known results, specialised to our setting. Throughout this section, we assume that we are working with the regularity structure defined in Section 4.2 and that ξ ε is defined as in (4.1) and satisfies Assumption 2.1.
Proof. We aim to apply Corollary B.6. Let γ = 1 2 − 2κ, η = − 5 2 − κ (so that in particular γ − η = 3 − κ), and let B = C γ −η (on which C γ −η then acts canonically by multiplication) with the injection ιg = (L γ −η g) 1 + ∈ D γ,η . (This is actually independent of the model .) Given g ∈ B, we set which is consistent with the reconstruction operator by our assumption on ζ + . We are therefore in the setting of Corollary B.6 provided that we set K 0 = K Neu .
This guarantees that we can find ∈ C γ +2,η+2 with the desired properties. The continuity as a function of ζ + and the model follows from the corresponding continuity statement in Corollary B.6. Proposition 6.3. For every g ∈ C 2−κ one can find taking values in the Taylor polynomials such that, setting one has V ∈ D 2−2κ,w withw = 1 2 − 2κ, 1 2 − 2κ, 0 , and R ε V = K 1 + gˆ ε .
for some functions g 1 , g 2 ∈ C 2−κ and g 3 ∈ C 3 2 −κ,w with w = − 1 2 −κ, 1 2 −κ, 1 2 −κ . Then, there exists a unique modelled distribution which we call R( G) such that R( G) =R( G) on test functions whose support does not intersect R × ∂ D and such that locally uniformly over x ∈ R × D and uniformly over λ ∈ (0, 1]. Furthermore, there exists V ∈ D 2−2κ,w withw = 1 2 − 2κ, 1 2 − 2κ, 0 taking values in the translation invariant sector, and such that RV = K R( G). The function V is of the form with taking values in the Taylor polynomials, and the map (g 1 , g 2 , g 3 , ) → V is uniformly Lipschitz continuous on bounded sets.
Proof. We use again the same strategy of proof as in Proposition 6.3, but this time we take as our space B the space of triples g = (g 1 , g 2,i , g 3 ) as in the statement of the proposition and we set where R is the reconstruction operator given by Theorem C.5. This time, we have deg = − 3 2 − 2κ and deg = −1 − κ, so that it follows from [9, Lem. 4.3] that ιg, G ∈ D γ,w for γ = 1 2 − 2κ and w = − 3 This shows that Theorem C.5 can indeed be applied to this situation since furthermore our admissible models are such thatˆ ε ( τ )(φ) = 0 as soon as φ is supported outside of R × D. The remainder of the proof then follows from an application of [9,Lem. 4.12] in the same way as in the proof of Proposition 6.3. Remark 6.5. Note that the results of [9] do not apply here since deg and deg are both strictly below −1. Our saving grace is that the coefficients are sufficiently well-behaved near the boundary of the domain.
Proof. Recall thatK n is supported in the ball of radius 2 −n and note that z − R(z ) ≥ z − z whenever both z and z belong to R× D. As a consequence, for every z, z ∈ ([0, 1]× D) 2 , every n ≥ 0 and every R ∈ G, one has eitherK n (z − R(z )) = 0 or φ(2 n z − z ) = 1, so that one can replace φ by 1 in (A.1).
In order to state the main result of this appendix, we writeD = R + × D as a shorthand.
Proposition A.2. Let K , K ∂ , K i and K ∂,i be as above and let ζ, ζ c ∈ C α with α ≤ −2 be such that ζ is supported onD and ζ c is supported on its complement. Then, restricted toD, both K ζ c and K ∂ ζ belong to C γ,w for w = α + 2 3 and any γ > 0. Furthermore, when restricted toD c , K ζ belongs to C γ,w . The same statements hold when α ≤ −3 with K and K ∂ replaced by K i and K ∂,i respectively and w = α + 3 3 .
Proof. Consider the case z ∈D, so that We now note that ζ, D kK n (z − ·) = 0 whenever 2 −n ≤ d(z,D), while in general by definition of C α . It immediately follows that one has so that one has indeed K ζ ∈ C γ,(α+2) 3 for every γ > 0. The case of K ζ c , K i ζ and K i ζ c is dealt with in exactly the same way. Consider now the case z ∈D and definẽ We then have the identity It follows again from the definition of C α that the bound (A.2) holds withK n (z − ·) replaced byK R n (z, ·). We also note thatK R n (z, ·) = 0 unless there exists a point z such that one has on one hand z − z 2 −n and on the other hand z − R(z ) 2 −n . In particular, there exists a constant C such that, if d(z, ∂D) ≥ C2 −n , one hasK R n (z, ·) = 0 unless R = id (which is excluded from the sum (A.4)). If on the other hand d(z, ∂D) ≤ C2 −n , thenK R n (z, ·) is only non-zero for at most eight different reflections R. Combining these observations shows as before that the bound (A.3) holds with K ζ replaced by K ∂ ζ . The proof for K ∂ replaced by K ∂,i is virtually identical. Corollary A.3. For α ∈ (−3, −2] and ζ ∈ C α supported on R + × D, one has K ∂ ζ ∈ C α+2 . Proof. This follows immediately from Proposition A.2, combined with the fact that C γ,(η) 3 ⊂ C η whenever γ ≥ η > −1.
Proof. With the same notations as above, we have for z = (t, x), As before, the summand vanishes as soon as 2 −n √ |t| and, for all z ∈ R + × R d and n ≥ 0, only at most a fixed number of test functionsK R n (z, ·) are non-vanishing, so that we obtain for t > 0 the bound This then implies the required bound by the definition of the spaces C γ,w .

Appendix B: Integration and Multiplication by Smooth Functions Almost Commute
We will use a form of the multilevel Schauder theorem of [11,Sec. 5] with slightly weaker assumptions on the kernel K . In this section, we fix a space-time scaling s as in [11]. (In our case this would be the parabolic scaling s = (2, 1, . . . , 1).) where the functionsK n have the following properties: • For all n ≥ 0, the mapK n is supported in the set {(x, y) : x − y s ≤ 2 −n }.
• For any two multiindices k and , there exists a constant C such that the bound We assume that we are given a regularity structure of the type studied in [5,11], endowed with a family of integration maps I k for multiindices k ∈ N d . We also assume that we are given a model ( , ) which is admissible for the collection of kernels K k such that

(B.3)
We will furthermore assume that our regularity structure contains the polynomial structure on R d (for some fixed scaling) and that admissible models satisfy the usual identity x X k τ = ( • − x) k x τ for every τ ∈ T . Assuming that I 0 is of order β, we assume that I k is of order β + |k| s , which is compatible with (B.3) in the sense that if K 0 is β-regularising in the sense of Definition B.1, then K k is (β + |k| s )-regularising.
It follows that for any fixed values of q and r the above sum vanishes, except when r −q = 0, so that it equals constrained by |q| s < α + β + | | s , which is precisely the integrand appearing in the second term of (B.5).
The second identity immediately follows from the first one. Indeed, both sides take values in the Taylor polynomials by the definition of an admissible model. Furthermore, the first identity implies that

(B.6)
As a consequence of combining this with the first identity of (B.4), both sides of the second identity of (B.4) are equal after applying x to them. Since furthermore both sides belong to the space of Taylor polynomials on which x is injective, the claim follows.
Corollary B.4. In the context of Lemma B.3, write K k for the integration operator associated to I k as in [11] and write L γ for the Taylor lift C γ → D γ . Then, for every F ∈ D γ χ with χ ≤ 0 < γ and g ∈ C θ−χ with θ ∈ (0, γ ] (and such that θ + β ∈ Z), one can find a function φ ∈ C θ+β such that, setting one has G ∈ D θ+β and RG = K (gRF).
Proof. A straightforward calculation virtually identical to (B.6) shows that, as a consequence of the first identity of Lemma B.3, one has
Since furthermoreḠ and each of the terms L θ−χ −| | s (D g) K F belong to D θ+β , we conclude that one necessarily has G ∈ D θ+β . This in turn implies that ∈ D θ+β , so that it is the lift of a function φ ∈ C θ+β . By (B.9), we furthermore have RG = RḠ = K (gRF), thus concluding the proof.
One simple but very useful corollary of this result can be formulated as follows: Corollary B.5. Let ζ ∈ C χ with χ ≤ 0, let K and K k be as above, and let g ∈ C θ−χ with θ > 0 and θ + β ∈ Z. Then, Proof. It suffices to consider the case of a regularity structure with symbol (plus Taylor polynomials and their products with ) and model mapping to ζ . We then apply the reconstruction operator to both sides of (B.7) with F = .
In our context, we will need an analogous result, but for the spaces D γ,η of [11,Sec. 6]. Furthermore, we will need to be able to cover situations in which [11,Prop. 6.16] does not apply because we consider elements taking values in a sector of regularity below −2, so that the reconstruction theorem [11, Prop. 6.9] fails. We therefore make use instead of [9,Lem. 4.12] which, given F ∈ D γ,η (V ) with V of regularity α ≥ η, allows us to specify a distribution ζ ∈ C η such that (RF)(φ) = ζ(φ) for every test function φ whose support does not intersect the plane P = {(t, x) : t = 0}. We then have the following result.
Corollary B.6. Let γ > 0, let V be a sector of regularity α ≤ 0, and let w = (η, σ, μ) with η, σ, μ ≤ α and η ≤ σ ∧ μ, η + β > −2, σ + β > −1. For each admissible model , let B be a Banach space equipped with a bounded map ι : B → D γ,w (V ). (Since the latter depends on in general, this can also be the case for B and/or ι. In this case, we assume that ι is bounded independently of the underlying model.) Let furthermoreR : B → C η be a continuous linear operator such that (RF)(φ) = (RιF)(φ) for every F ∈ B and every test function φ whose support does not intersect the two boundaries. We furthermore assume that we have a continuous bilinear map such that ι(gF) = L γ −η (g)ιF and such that where the right hand side is meaningful thanks to the fact that γ > 0. Then, withγ andw as in [9,Lem. 4.12], one can find φ ∈ Cγ ,w such that the modelled distribution G given by (B.7) (with suitably defined K , see the proof) belongs to Dγ ,w and satisfies RG = K 0 (gRF).
If furthermore one has a sequence of models n → with associated subspaces B n ⊂ D γ,w (V ) and reconstruction operatorsR n , as well as a sequence F n ∈ B n such that |||F n ; F||| γ,w → 0 and R n F n −RF η → 0, then the sequence of modelled distributions G n constructed in the first part of the statement converges to G in the sense that |||G n ; G|||γ ,w → 0.
Proof. The proof is virtually identical to that of Corollary B.4. The only difference is that we use [9,Lem. 4.12] to define maps K k : B → Dγ k ,w k (withγ k andw k defined likeγ and w, but with β replaced by β + |k| s ) such that RK k F = K kR F. (Since η + β > −2 by assumption, the assumptions of that lemma are satisfied and RK k F is always well-defined as a distribution on the whole space-time.)

Appendix C: Reconstruction Theorem
In this section, we present a version of the reconstruction theorem that allows to bypass to some extent the condition ν > −1 appearing in [9]. This appendix was written in collaboration with Máté Gerencsér. In this section we assume that P 0 = {0} × R d , P 1 = R × ∂ D, and P = P 0 ∪ P 1 , and we write |x| P i for the parabolic distance between x and P i . Generic points x, y, etc are space-time points. Throughout this section, we fix a regularity structure (T, G) containing the polynomial structure and such that the product with elements of the polynomial structureT is welldefined in T . We also only consider models such that x,y acts onT by translations by y − x. Let us recall from [11,Sec 8] that, given any regularity structure (T, G), a model ( , ) can alternatively be described by a linear map : T → D , together with a continuous map F : R d → G such that, setting x = F x , xy = (F x ) −1 F y , the analytic bounds for models are satisfied. The main assumption we impose on our models in the present setting is the following: Assumption C.1. Setting T < = α≤1 T α , there exist linear maps + , − : T < → D with + + − = T < and such that • + τ (ψ) = 0 for all τ ∈ T < and all ψ supported in R × D c and − τ (ψ) = 0 for all ψ supported in R × D; • Setting + x = + F x , − x := − F x the pairs ( + , ) and ( − , ) are models on (T < , G) in the sense of [11]. (But they are not admissible in general!) We will always write α for the lowest degree appearing in our ambient regularity structure T . We also fix γ > 0 as well as exponents η and σ on which we make the following assumption: Assumption C.2. The exponents satisfy the condition 0 > σ > −1 α ≥ η > −2.

(C.1)
We also use the shorthand w = (η, σ, η) similarly to [9] (except that we make the simplifying assumption that the "corner exponent" coincides with η which is not essential but simplifies our argument). One crucial ingredient for our result is the following: Lemma C. 3. For every f ∈ D γ,w , there exist f ± ∈ D σ,η such that f + (x) = f (x) for x ∈ R × D and f − (x) = f (x) for x ∈ R × D c .
Proof. It follows from the definition of the spaces D γ,w that the restriction of Q <σ f to either R ×D or R × D c belongs to D σ,η . In particular, components of f of degree below σ can be extended continuously to (R\{0}) × D. (Note that σ < 0 though!) The claim then follows from an adaptation of Whitney's extension theorem to the setting of regularity structures, see for example [18,Thm 5.3.16].
Remark C. 4. In general there is no reason for f + and f − to coincide on R × ∂ D.
We define spaces C α,η with η ≤ α < 0 as consisting of those distributions ζ ∈ C η (R 1+d ) ∩ C α (R 1+d \P 0 ) such that uniformly over x in compacts away from P 0 , 2λ ∈ (0, 2 ∧ |x| P 0 ], and test functions ψ ∈ B. Here and below we write B for the set of functions supported in the centred (parabolic) ball of radius 1 and with r derivatives bounded by 1, where r is some fixed sufficiently large value. Note that if η > −2, ζ ∈ C α,η is uniquely determined by its action on test functions supported outside P 0 , see [11].
Remark C.6. We did not specify the continuity of the map R with respect to different models. We will continue to omit continuity statements in the sequel, on one hand for the sake of easing the presentation, and on the other hand due to the fact that since all the operations discussed here are linear, the "linearisation trick" of [14,Prop 3.11] automatically implies all the required continuity properties.
Proof. The proof is very similarly to that of [9, Thm 4.10], but due to the central role of the statement in our proof, we provide some detail. The uniqueness part is quite straightforward: take two C α,η distributions ξ 1 , ξ 2 that have the properties claimed for R f in the theorem. Their difference then vanishes away from P, and thanks to the bound (C.2), must belong to C α,η . Since η > −2 however, such a distribution necessarily vanishes.
To construct R, we use essentially the same construction as in the proof of [11,Prop. 6.9].
Similarly to the construction of the functions φ x,n performed there, we can find, for every n ∈ N, a countable index set n and functions φ x,n with n ∈ N and x ∈ n with the following properties. There exist constants c i > 0 such that: (i) For every n ∈ N and x ∈ n there exists ψ ∈ B, y ∈ R d+1 with |y| P 1 = 2 −n such that, setting λ = 2 −n−1 , one has φ x,n = c 1 λ d+2 ψ λ y . (ii) For every ball B of (parabolic) radius λ, there exist at most c 2 λ d+1 elements x ∈ n with suppφ x,n ∩ B = ∅. (iii) For every y ∈ R d+1 with 0 < |y| P 1 ≤ 1, one has n∈N x∈ n φ x,n (y) = 1.
(Note that the sum appearing in the last claim always converges since, by the first two properties, it is guaranteed to only contain finitely many terms.) We now write R ± for the reconstruction operators for D γ spaces with γ < 0 associated to the models ± as in the second part of [11,Thm 3.10], we fix y ∈ R d+1 \P 0 , λ ≤ 1 ∧ |y| P 0 /C for some large enough (but fixed) constant C, and ψ ∈ B, and we define