Three-dimensional stochastic cubic nonlinear wave equation with almost space-time white noise

We study the stochastic cubic nonlinear wave equation (SNLW) with an additive noise on the three-dimensional torus $\mathbb{T}^3$. In particular, we prove local well-posedness of the (renormalized) SNLW when the noise is almost a space-time white noise. In recent years, the paracontrolled calculus has played a crucial role in the well-posedness study of singular SNLW on $\mathbb{T}^3$ by Gubinelli, Koch, and the first author (2018), Okamoto, Tolomeo, and the first author (2020), and Bringmann (2020). Our approach, however, does not rely on the paracontrolled calculus. We instead proceed with the second order expansion and study the resulting equation for the residual term, using multilinear dispersive smoothing.

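1. Introduction

1.1. Singular stochastic nonlinear wave equation. In this paper, we study the following Cauchy problem for the stochastic nonlinear wave equation (SNLW) with a cubic nonlinearity on the three-dimensional torus $\mathbb{T}^3 = (\mathbb{R}/(2\pi\mathbb{Z}))^3$, driven by an additive noise:

$$\begin{cases} \partial_t^2 u + (1 - \Delta) u + u^3 = \phi \xi \\ (u, \partial_t u)|_{t = 0} = (u_0, u_1), \end{cases} \qquad (x, t) \in \mathbb{T}^3 \times \mathbb{R}, \tag{1.1}$$

where $\xi(x, t)$ denotes a (Gaussian) space-time white noise on $\mathbb{T}^3 \times \mathbb{R}$ with the space-time covariance given by

$$\mathbb{E}\big[\xi(x_1, t_1)\xi(x_2, t_2)\big] = \delta(x_1 - x_2)\delta(t_1 - t_2)$$

and $\phi$ is a bounded operator on $L^2(\mathbb{T}^3)$. Our main goal is to present a concise proof of local well-posedness of (1.1), when $\phi$ is the Bessel potential of order $\alpha$:

$$\phi = \langle \nabla \rangle^{-\alpha} = (1 - \Delta)^{-\frac{\alpha}{2}} \tag{1.2}$$

for any $\alpha > 0$. Namely, we consider (1.1) with an "almost" space-time white noise.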
Given α ∈ R, let φ = φ α be as in (1.2). Then, a standard computation shows that the stochastic convolution: belongs almost surely to C(R; W s,∞ (T 3 )) for any s < α − 1 2 . See Lemma 3.1 below. Here, we adopted Hairer's convention to denote stochastic terms by trees; the vertex " " in corresponds to the random noise φξ = ∇ −α ξ, while the edge denotes the Duhamel integral operator: corresponding to the forward fundamental solution to the linear wave equation. Note that when α > 1 2 , the stochastic convolution is a function of positive (spatial) regularity α − 1 2 − ε. 1 Then, by proceeding with the first order expansion: u = + v and studying the equation for the residual term v = u − , we can show that (1.1) is locally well-posed, when α > 1 2 . See [13,58] in the case of the deterministic cubic nonlinear wave equation (NLW): with random initial data. Furthermore, by controlling the growth of the H 1 -norm of the residual term v via a Gronwall-type argument, we can prove global well-posedness of (1.1), when α > 1 2 . 2 See [13].
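As a sanity check on the threshold $s < \alpha - \frac12$, the variance computation behind it can be sketched as follows. This is only a heuristic outline under stated assumptions, not the proof of Lemma 3.1: $\Psi$ is placeholder notation (introduced here) for the stochastic convolution, $B_n$ denotes the Brownian motion driving the $n$-th Fourier mode of $\xi$ with $\operatorname{Var}(B_n(t)) = t$, and the Duhamel operator is assumed to be the Klein-Gordon one with multiplier $\sin((t-t')\langle n\rangle)/\langle n\rangle$, where $\langle n\rangle = \sqrt{1+|n|^2}$.

```latex
% Sketch only: \Psi is placeholder notation for the stochastic convolution.
\begin{align*}
% n-th Fourier coefficient as a Wiener integral
\widehat{\Psi}(n,t)
  &= \int_0^t \frac{\sin((t-t')\langle n\rangle)}{\langle n\rangle^{1+\alpha}}\,dB_n(t'), \\
% Ito isometry, then |\sin| \le 1
\mathbb{E}\big[|\widehat{\Psi}(n,t)|^2\big]
  &= \int_0^t \frac{\sin^2((t-t')\langle n\rangle)}{\langle n\rangle^{2+2\alpha}}\,dt'
   \le \frac{t}{\langle n\rangle^{2+2\alpha}}, \\
% sum over the lattice \mathbb{Z}^3: convergence needs exponent < -3
\mathbb{E}\big[\|\Psi(t)\|_{H^s}^2\big]
  &\lesssim t \sum_{n\in\mathbb{Z}^3} \langle n\rangle^{2s-2-2\alpha} < \infty
   \quad\text{if } 2s-2-2\alpha < -3,\ \text{i.e. } s < \alpha - \tfrac12.
\end{align*}
```

Passing from this second-moment bound to almost sure bounds in $W^{s,\infty}$ typically costs only an arbitrarily small amount of regularity (via a Wiener chaos estimate and Sobolev embedding), which is consistent with the same threshold appearing in the statement above.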
When α ≤ 1 2 , solutions to (1.1) are expected to be merely distributions of negative regularity α − 1 2 − ε, inheriting the regularity of the stochastic convolution, and thus we need to consider the renormalized version of (1.1), which formally reads (1.6) where the formal expression u 3 − ∞ · u denotes the renormalization of the cubic power u 3 . In the range 1 4 < α ≤ 1 2 , a straightforward computation with the second order expansion: u = − + v yields local well-posedness of the renormalized SNLW (1.6) (in the sense of Theorem 1.1 below). Here, the second order process is defined by where denotes the renormalized version of 3 . See [51] for this argument in the context of the deterministic renormalized cubic NLW (1.5) with random initial data. We state our main result. More precisely, given N ∈ N, let ξ N = π N ξ, where π N is the frequency projector onto the spatial frequencies {|n| ≤ N } defined in (1.13) below. Then, there exists a sequence of time-dependent constants {σ N (t)} N ∈N tending to ∞ (see (1.16) below ) such that, given small ε = ε(s) > 0, the solution u N to the following truncated renormalized SNLW : converges to a non-trivial 3 stochastic process u ∈ C([−T, T ]; H α− 1 2 −ε (T 3 )) almost surely, where T = T (ω) is an almost surely positive stopping time.
Stochastic nonlinear wave equations have been studied extensively in various settings; see [15, Chapter 13] and the references therein. In particular, over the last few years, we have witnessed rapid progress in the theoretical understanding of nonlinear wave equations with singular stochastic forcing and/or rough random initial data; see [57,25,26,27,51,53,47,50,66,19,54,45,56,20,55,48,12,49]. In [26], Gubinelli, Koch, and the first author studied the quadratic SNLW (1.8) on $\mathbb{T}^3$. By adapting the paracontrolled calculus [24], originally introduced by Gubinelli, Imkeller, and Perkowski in the study of stochastic parabolic PDEs, to the dispersive setting, the authors of [26] reduced (1.8) to a system of two unknowns. This system was then shown to be locally well-posed by exploiting the following two ingredients: (i) multilinear dispersive smoothing coming from a multilinear interaction of random waves (see also [45,12]) and (ii) novel random operators (the so-called paracontrolled operators) which incorporate the paracontrolled structure in their definition. These random operators are used to replace commutators which are standard in the parabolic paracontrolled approach [14,40].

3 Here, non-triviality means that the limiting process $u$ is not zero or a linear solution. As we see below, the limiting process $u$ admits a decomposition $u = (\text{stochastic convolution}) - (\text{second order process}) + v$, where the residual term $v$ satisfies the nonlinear equation (1.25). See Remark 1.4 (ii) on a triviality result for the unrenormalized equation. See also [30,47,51,54] for related triviality results.
More recently, Okamoto, Tolomeo, and the first author [48] and Bringmann [12] independently studied the following SNLW with a cubic Hartree-type nonlinearity: 4 where V is the kernel of the Bessel potential ∇ −β of order β > 0. 5 In [48], the authors proved local well-posedness for β > 1 by viewing the nonlinearity as the nested bilinear interactions and utilizing the paracontrolled operators introduced in [26]. In [12], Bringmann went much further and proved local well-posedness of (1.9) for any β > 0. The main strategy in [12] is to extend the paracontrolled approach in [26] to the cubic setting. The main task is then to study regularity properties of various random operators and random distributions. This was done by an intricate combination of deterministic analysis, stochastic analysis, counting arguments, the random matrix/tensor approach by Bourgain [9,10] and Deng, Nahmod, and Yue [18], and the physical space approach via the (bilinear) Strichartz estimates due to Klainerman and Tataru [36], analogous to the random data Cauchy theory for the nonlinear Schrödinger equations on R d as in [2,3,4]. From the scaling point of view, the cubic SNLW (1.6) with a slightly smoothed space-time white noise (i.e. small α > 0) is essentially the same as the Hartree SNLW (1.9) with small β > 0. Hence, Theorem 1.1 is expected to hold in view of Bringmann's recent result [12]. The main point of this paper is that we present a concise proof of Theorem 1.1 without using the paracontrolled calculus. In the next subsection, we outline our strategy.
Due to the time reversibility of the equation, we only consider positive times in the remaining part of the paper.
Remark 1.2. The equations (1.1) and (1.6) indeed correspond to the stochastic nonlinear Klein-Gordon equations. The same results with inessential modifications also hold for the stochastic nonlinear wave equation, where we replace the linear part in (1.1) and (1.6) by $\partial_t^2 u - \Delta u$. In the following, we simply refer to (1.1) and (1.6) as the stochastic nonlinear wave equations.

The main goal of [48] is to study the focusing problem, in particular the (non-)construction of the focusing Gibbs measure associated to the focusing Hartree SNLW. They identified the critical value $\beta = 2$ and proved sharp global well-posedness of the focusing problem (with a small coefficient in front of the nonlinearity when $\beta = 2$). On the other hand, the main goal in [12] is the construction of global-in-time dynamics in the defocusing case, where there was a significant difficulty in adapting Bourgain's invariant measure argument [8,9]. This is due to (i) the singularity of the associated Gibbs measure with respect to the base Gaussian free field for $0 < \beta \le \frac12$ [48,11] and (ii) the paracontrolled structure imposed in the local theory, which must be propagated in the construction of global-in-time solutions. See the introductions of [48,12] for further discussion.

Remark 1.3. Our argument also applies to the deterministic (renormalized) cubic NLW on $\mathbb{T}^3$ with random initial data of the form: where $\{g_n\}_{n\in\mathbb{Z}^3}$ and $\{h_n\}_{n\in\mathbb{Z}^3}$ are two families of independent standard complex-valued Gaussian random variables conditioned that $g_n = \overline{g_{-n}}$, $h_n = \overline{h_{-n}}$, $n \in \mathbb{Z}^3$. In particular, Theorem 1.1 provides an improvement of the main result (almost sure local well-posedness) in [51] from $\alpha > \frac14$ to $\alpha > 0$.

Remark 1.4. (i) The first part of the statement in Theorem 1.1 is merely a formal statement in view of the divergent behavior $\sigma_N(t) \to \infty$ for $t \neq 0$. In the next subsection, we make precise what it means to be a solution to (1.6) and also make the uniqueness statement more precise.
See Remark 1.9. (ii) In the case of the defocusing cubic SNLW with damping, a combination of our argument with that in [47] yields the following triviality result. Consider the following truncated (unrenormalized) SNLW with damping: where $\xi_N = \pi_N \xi$. As we remove the regularization (i.e. take $N \to \infty$), the solution $u_N$ converges in probability to the trivial function $u_\infty \equiv 0$ for any (smooth) initial data $(u_0, u_1)$. See [47] for details.

Remark 1.5. (i) In our proof, we use the Fourier restriction norm method (i.e. the $X^{s,b}$-spaces defined in (2.8)), following [57,12]. While it may be possible to give a proof of Theorem 1.1 based only on the physical-side spaces (such as the Strichartz spaces) as in [25,26,27], we do not pursue this direction since our main goal is to present a concise proof of Theorem 1.1 by adapting various estimates in [12] to our current setting. Note that the use of the physical-side spaces would allow us to take the initial data $(u_0, u_1)$ in the critical space $H^{\frac12}(\mathbb{T}^3)$ (for the cubic NLW on $\mathbb{T}^3$). See for example [25]. One may equally use the Fourier restriction norm method adapted to the space of functions of bounded $p$-variation and its pre-dual, introduced and developed by Tataru, Koch, and their collaborators [37,28,31], which would also allow us to take the initial data $(u_0, u_1)$ in the critical space $H^{\frac12}(\mathbb{T}^3)$. See for example [3,46] in the context of the nonlinear Schrödinger equations with random initial data. Since our main focus is to handle rough noises (and not rough deterministic initial data), we do not pursue this direction.
It would be of interest to extend Theorem 1.1 to a general Hilbert-Schmidt operator $\phi$, say from $L^2(\mathbb{T}^3)$ to $H^{\alpha - \frac32}(\mathbb{T}^3)$ as in [16,52,44].$^{6}$ Note that our argument uses the independence of the Fourier coefficients of the stochastic convolution but that such independence will be lost for a general Hilbert-Schmidt operator $\phi$.

6 Or a general $\gamma$-radonifying operator $\phi$ as in [21], where the authors proved local well-posedness of the one-dimensional stochastic cubic nonlinear Schrödinger equation with an almost space-time white noise.

Remark 1.6. (i) When $\alpha = 0$, SNLW (1.6) with damping corresponds to the so-called canonical stochastic quantization equation$^{7}$ for the Gibbs measure given by the $\Phi^4_3$-measure on $u$ and the white noise measure on $\partial_t u$. See [60]. In this case (i.e. when $\alpha = 0$), our approach and the more sophisticated approach of Bringmann [12] for (1.9) with $\beta > 0$ completely break down. This is a very challenging problem, for which one would certainly need to use the paracontrolled approach in [26,48,12] and combine with the techniques in [18].
(ii) As mentioned above, when $\alpha > \frac12$, the globalization argument by Burq and Tzvetkov [13] yields global well-posedness of SNLW (1.1) with $\phi$ as in (1.2). When $\alpha = 0$, we expect that (a suitable adaptation of) Bourgain's invariant measure argument would yield almost sure global well-posedness once we could prove local well-posedness of (1.10) (but this is a very challenging problem). It would be of interest to investigate the issue of global well-posedness of (1.6) for $0 < \alpha \le \frac12$. See [27,66] for the global well-posedness results on SNLW with an additive space-time white noise in the two-dimensional case.

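1.2. Outline of the proof. Let us now describe the strategy to prove Theorem 1.1. Let $W$ denote a cylindrical Wiener process on $L^2(\mathbb{T}^3)$:$^{8}$

$$W(t) = \sum_{n \in \mathbb{Z}^3} B_n(t) e_n,$$

where $e_n(x) = e^{i n \cdot x}$ and $\{B_n\}_{n \in \mathbb{Z}^3}$ is defined by $B_n(t) = \langle \xi, \mathbf{1}_{[0,t]} \cdot e_n \rangle_{x,t}$. Here, $\langle \cdot, \cdot \rangle_{x,t}$ denotes the duality pairing on $\mathbb{T}^3 \times \mathbb{R}$. As a result, we see that $\{B_n\}_{n \in \mathbb{Z}^3}$ is a family of mutually independent complex-valued Brownian motions conditioned so that $B_{-n} = \overline{B_n}$, $n \in \mathbb{Z}^3$. In particular, $B_0$ is a standard real-valued Brownian motion. Note that we have, for any $n \in \mathbb{Z}^3$,

$$\operatorname{Var}(B_n(t)) = \mathbb{E}\Big[ \langle \xi, \mathbf{1}_{[0,t]} \cdot e_n \rangle_{x,t} \overline{\langle \xi, \mathbf{1}_{[0,t]} \cdot e_n \rangle_{x,t}} \Big] = \| \mathbf{1}_{[0,t]} \cdot e_n \|_{L^2_{x,t}}^2 = t.$$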
With this notation, we can formally write the stochastic convolution = I( ∇ −α ξ) in (1.3) as where ∇ = √ 1 − ∆ and n = 1 + |n| 2 . We indeed construct the stochastic convolution in (1.11) as the limit of the truncated stochastic convolution N defined by for N ∈ N, where π N denotes the (spatial) frequency projector defined by (1.13) A standard computation shows that the sequence { N } N ∈N is almost surely Cauchy in 9 C([0, T ]; W α− 1 2 −,∞ (T 3 )) and thus converges almost surely to some limit, which we denote by , in the same space. See Lemma 3.1 below.
We then define the Wick powers N and N by 14) and the second order process N by where I denotes the Duhamel integral operator in (1.4). Here, σ N (t) is defined by 10 (1.16) We point out that a standard argument shows that N and N converge almost surely to in C([0, T ]; W 2α−1−,∞ (T 3 )) and to in C([0, T ]; W 3α− 3 2 −,∞ (T 3 )), respectively, but that we do not need these regularity properties of the Wick powers and in this paper.
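For orientation, the divergence rate of the renormalization constant can be sketched by a direct computation. The block below assumes (as footnote 10 suggests) that $\sigma_N(t)$ is the variance of the truncated stochastic convolution, writes $\Psi_N$ as placeholder notation for that truncated process, and assumes the Klein-Gordon Duhamel multiplier $\sin((t-t')\langle n\rangle)/\langle n\rangle$ with $\operatorname{Var}(B_n(t)) = t$. It is a heuristic check, not a substitute for the definition (1.16).

```latex
% Sketch only: \Psi_N is placeholder notation for the truncated stochastic convolution.
\begin{align*}
\sigma_N(t) = \mathbb{E}\big[\Psi_N(x,t)^2\big]
  &= \sum_{|n|\le N} \int_0^t
     \frac{\sin^2((t-t')\langle n\rangle)}{\langle n\rangle^{2+2\alpha}}\,dt' \\
% use \int_0^t \sin^2(a s)\,ds = t/2 - \sin(2at)/(4a) with a = \langle n\rangle
  &= \sum_{|n|\le N} \bigg(
     \frac{t}{2\langle n\rangle^{2+2\alpha}}
     - \frac{\sin(2t\langle n\rangle)}{4\langle n\rangle^{3+2\alpha}} \bigg).
\end{align*}
% For fixed t > 0 and 0 < \alpha < 1/2, as N \to \infty:
%   first sum  \sim t N^{1-2\alpha}   (lattice sum of \langle n\rangle^{-2-2\alpha} over |n| \le N in Z^3),
%   second sum = O(1)                 (since 3 + 2\alpha > 3),
% so \sigma_N(t) \sim t N^{1-2\alpha} \to \infty, while \sigma_N(0) = 0.
```

In particular, the divergence occurs only away from $t = 0$, and at the endpoint $\alpha = \frac12$ the same computation gives logarithmic rather than power growth in $N$.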
As for the second order process N in (1.15), if we proceed with a "parabolic thinking", 11 then we expect that N has regularity 12 3α − 1 2 − = (3α − 3 2 −) + 1, which is negative for α ≤ 1 6 . In the dispersive setting, however, we can exhibit multilinear smoothing by exploiting multilinear dispersion coming from an interaction of (random) waves. In fact, by adapting the argument in [12] to our current problem, we can show an extra ∼ 1 2 -smoothing for N , uniformly in N ∈ N, and for the limit = I( ) = lim N →∞ N and thus they have positive regularity. See Lemma 3.1. As in [26,12], such multilinear smoothing plays a fundamental role in our analysis.
9 Hereafter, we use $a-$ (and $a+$) to denote $a - \varepsilon$ (and $a + \varepsilon$, respectively) for arbitrarily small $\varepsilon > 0$. If this notation appears in an estimate, then an implicit constant is allowed to depend on $\varepsilon > 0$ (and it usually diverges as $\varepsilon \to 0$).

10 In our spatially homogeneous setting, the variance $\sigma_N(t)$ is independent of $x \in \mathbb{T}^3$.

11 Namely, if we only take into account the (uniformly bounded in $N$) regularity $3\alpha - \frac32-$ of the truncated cubic Wick power and one degree of smoothing from the Duhamel integral operator $\mathcal{I}$ without taking into account the product structure and the oscillatory nature of the linear wave propagator.

12 By "regularity", we mean the spatial regularity $s$ of the truncated second order process as an element in $C([0, T]; W^{s,\infty}(\mathbb{T}^3))$, uniformly bounded in $N \in \mathbb{N}$.

Let us now start with the truncated renormalized SNLW (1.7) and obtain the limiting formulation of our problem. By proceeding with the second order expansion (1.17), we rewrite (1.7) as (1.18), where we used (1.14). The main problem in studying singular stochastic PDEs lies in making sense of various products. In this formal discussion, let us apply the following "rules":
• A product of functions of regularities $s_1$ and $s_2$ is defined if $s_1 + s_2 > 0$. When $s_1 > 0$ and $s_1 \ge s_2$, the resulting product has regularity $s_2$.
• A product of stochastic objects (not depending on the unknown) is always well defined, possibly with a renormalization. The product of stochastic objects of regularities s 1 and s 2 has regularity min(s 1 , s 2 , s 1 + s 2 ).
We postulate that the unknown v has regularity 1 2 +, 13 which is subcritical with respect to the standard scaling heuristics for the three-dimensional cubic NLW. In order to close the Picard iteration argument, we need all the terms on the right-hand side of (1.18) to have regularity − 1 2 +. With the aforementioned regularities of the stochastic terms N , N , and N and applying the rules above, we can handle the products on the right-hand side of (1.18), giving regularity − 1 2 +, except for the following terms (for small α > 0): As for the first term N N v N , we first use stochastic analysis to make sense of N N with regularity α − 1 2 −, uniformly in N ∈ N, (see Lemma 3.3) and then interpret the product as Note that the right-hand side is well defined since the sum of the regularities is positive: The last product N N in (1.19) makes sense but the resulting regularity is 2α − 1−, smaller than the required regularity − 1 2 +, when α is close to 0. As for the second term in (1.19), it depends on the unknown v N and thus the product does not make sense (at this point) since the sum of regularities is negative (when α > 0 is small).
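The regularity bookkeeping in this paragraph can be made concrete with a short worked count. The block below is a heuristic sketch that only applies the two "rules" stated above; $\Psi$ and $:\!\Psi^2\!:$ are placeholder notations (introduced here) for the stochastic convolution and its Wick square, with the regularities quoted in the text, and all $\varepsilon$-losses are suppressed.

```latex
% Sketch only: reg(.) is the heuristic spatial regularity; target regularity is -1/2+.
\begin{align*}
\operatorname{reg}(\Psi) &= \alpha - \tfrac12 -, &
\operatorname{reg}(:\!\Psi^2\!:) &= 2\alpha - 1 -, &
\operatorname{reg}(v) &= \tfrac12 +, &
\operatorname{reg}(v^2) &= \tfrac12 + .
\end{align*}
\begin{align*}
% Rule 1 applies: the sum of regularities is positive, product inherits the smaller one
\Psi \cdot v^2 :&\quad (\alpha - \tfrac12 -) + (\tfrac12 +) > 0
   \ \Longrightarrow\ \operatorname{reg} = \alpha - \tfrac12 - \ \ge\ -\tfrac12 + , \\
% Rule 1 fails for small alpha: the sum of regularities is negative
:\!\Psi^2\!: \cdot\, v :&\quad (2\alpha - 1 -) + (\tfrac12 +) \approx 2\alpha - \tfrac12
   \ >\ 0 \iff \alpha > \tfrac14 .
\end{align*}
```

This count is consistent with the range $\frac14 < \alpha \le \frac12$ in which the straightforward second order expansion suffices, and it indicates why, for $0 < \alpha \le \frac14$, the product of the Wick square with $v$ has to be treated under the Duhamel integral operator rather than by a product estimate.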
13 As for the unknown $v$, we measure its regularity in (the local-in-time version of) the $X^{s, \frac12+}$-norm.

As we see below, by studying the last two terms in (1.19) under the Duhamel integral operator $\mathcal{I}$, we can indeed give a meaning to them and exhibit extra $(\frac12+)$-smoothing with the resulting regularity $\frac12+$ (under $\mathcal{I}$), which allows us to close the argument. By writing (1.18) with initial data $(u_0, u_1)$ in the Duhamel formulation, we have (without a renormalization). By exploiting random multilinear dispersion, we show that
• the random operator I N maps functions of regularity 1 2 + to those of regularity 1 2 + (measured in the X s,b -spaces) with the operator norm uniformly bounded in N ∈ N and I N converges to some limit, denoted by I , as N → ∞. We study the random operator I N via the random matrix approach [9,10,59,18,12]. 14 See Lemma 3.5.
• the third order process N has regularity 1 2 + (measured in the X s,b -spaces) with the norm uniformly bounded in N ∈ N and N converges to some limit, denoted by , as N → ∞. See Lemma 3.4. We deduce these claims as corollaries to Bringmann's work [12]. In [12], the smoothing coming from the potential V = ∇ −β in the Hartree nonlinearity (V * u 2 )u played an important role. In our problem, this is replaced by the smoothing ∇ −α on the noise and we reduce our problem to that in [12], essentially by the following simple observation: for any γ ≥ 0. Remark 1.7. In the following, we also set (1.24) By carrying out analysis analogous to (but more involved than) that for N N studied in Lemma 3.3 below, we can show that ) almost surely, thus converging to some limit 2 . In this paper, however, we proceed with space-time analysis as in [12]. Namely, we study N in the X s,b -spaces and show that it converges to some limit denoted by . See Lemma 3.4.
14 We also mention a recent preprint [61], where the random matrix approach is also used to prove probabilistic local well-posedness of the Zakharov-Yukawa system on the two-dimensional torus $\mathbb{T}^2$.

Putting everything together, we can take $N \to \infty$ in (1.20) and obtain the limiting equation (1.25),
where v is the solution to (1.25).
Remark 1.8. In terms of regularity counting, the sum of the regularities in · v 2 is positive. In the parabolic setting, one may then proceed with a product estimate. In the current dispersive setting, however, integrability of functions plays an important role and thus we need to proceed with care. See Lemmas 2.7 and 3.6. Remark 1.9. (i) By the use of stochastic analysis, the stochastic terms , , , , , and I in the enhanced data set are defined as the unique limits of their truncated versions. Furthermore, by deterministic analysis, we prove that a solution v to (1.25) is pathwise unique in an appropriate class. Therefore, under the decomposition u = − + v, the uniqueness of u refers to (a) the uniqueness of and as the limits of N and N and (b) the uniqueness of v as a solution to (1.25).
(ii) In this paper, we work with the frequency projector $\pi_N$ with a sharp cutoff function on the frequency side. It is also possible to work with smooth mollifiers $\eta_\delta$, where $\eta$ is a smooth, non-negative, even function with $\int \eta \, dx = 1$ and $\operatorname{supp} \eta \subset (-\pi, \pi]^3 \simeq \mathbb{T}^3$. In this case, we can show that a solution $u_\delta$ to (1.27) converges in probability to some limit $u$ as $\delta \to 0$. Furthermore, the limit $u$ is independent of the choice of a mollification kernel $\eta$ and agrees with the limiting process $u$ constructed in Theorem 1.1. This is the second meaning of the uniqueness of the limiting process $u$.

Remark 1.10. (i) From the "scaling" point of view, our problem for $0 < \alpha \ll 1$ is more difficult than the quadratic SNLW (1.8) considered in [26], where the paracontrolled calculus played an essential role. On the other hand, for the proof of Theorem 1.1, we do not need to use the paracontrolled ansatz for the remainder term $v = u - (\text{stochastic convolution}) + (\text{second order process})$ thanks to the smoothing on the noise and the use of space-time estimates, which allows us to place $v$ in the subcritical regularity $\frac12+$. Our approach to (1.6) and Bringmann's approach in [12] crucially exploit various multilinear smoothing, gaining $\sim \frac12$-derivative. When $\alpha = 0$ (or $\beta = 0$ in the Hartree SNLW (1.9)), such multilinear smoothing seems to give (at best) $\frac12$-smoothing and thus the arguments in this paper and in [12] break down in the $\alpha = 0$ case.
(ii) In [26], Gubinelli, Koch, and the first author studied the quadratic SNLW on T 3 with an additive space-time white noise (i.e. α = 0): With the Wick renormalization and the second order expansion u = − + v, where = I( ), the remainder term v = u − + satisfies As observed in [26], the main issue in studying (1.29) comes from the regularity 1 2 − of v, which is inherited from the regularity − 1 2 − of . As a result, the product v in (1.29) is not well defined since the sum of the regularities of and v is negative. As in (1.21), it is tempting to directly define the random operator I (v) = I( v), using the random matrix estimates. However, there is an issue in handling the "high × high → low" interaction and thus the random matrix approach alone is not sufficient to close the argument. In [26], this issue was overcome by a paracontrolled ansatz and an iteration of the Duhamel formulation. We point out that the use of the paracontrolled ansatz in [26] led to the following paracontrolled operator I< (v) = I(v < ), which avoids the undesirable high × high → low interaction. Instead of the paracontrolled calculus, one may use the random averaging operator from [17] together with an iteration of the Duhamel formulation. We, however, point out that due to the problematic high × high interaction, the random averaging operator as introduced in [17] alone (without iterating the Duhamel formulation) does not seem to be sufficient to study the quadratic SNLW (1.28).
• Organization of the paper. In Section 2, we go over the basic definitions and lemmas from deterministic and stochastic analysis. In Section 3, we first state the almost sure regularity and convergence properties of (the truncated versions of) the stochastic objects in the enhanced data set Ξ in (1.26). Then, we present the proof of our main result (Theorem 1.1). In Section 4, we establish the almost sure regularity and convergence properties of the stochastic objects in the enhanced data set. In Section A, we recall the counting lemmas from [12] which play a crucial role in Section 4. In Sections B and C, we provide the basic definitions and lemmas on multiple stochastic integrals and (random) tensors, respectively.

2. Notations and basic lemmas
We write $A \lesssim B$ to denote an estimate of the form $A \le CB$. Similarly, we write $A \sim B$ to denote $A \lesssim B$ and $B \lesssim A$ and use $A \ll B$ when we have $A \le cB$ for small $c > 0$. We also use $a+$ (and $a-$) to mean $a + \varepsilon$ (and $a - \varepsilon$, respectively) for arbitrarily small $\varepsilon > 0$.
When we work with space-time function spaces, we use short-hand notations such as $C_T H^s_x = C([0, T]; H^s(\mathbb{T}^3))$. When there is no confusion, we simply use $\widehat{u}$ or $\mathcal{F}(u)$ to denote the spatial, temporal, or space-time Fourier transform of $u$, depending on the context. We also use $\mathcal{F}_x$, $\mathcal{F}_t$, and $\mathcal{F}_{x,t}$ to denote the spatial, temporal, and space-time Fourier transforms, respectively.
We use the following short-hand notation: $n_{ij} = n_i + n_j$, etc. For example, $n_{123} = n_1 + n_2 + n_3$.
2.1. Sobolev spaces and Besov spaces. Let s ∈ R and 1 ≤ p ≤ ∞. We define the L 2 -based Sobolev space H s (T 3 ) by the norm: We also define the L p -based Sobolev space W s,p (T 3 ) by the norm: for any ξ ∈ R 3 . Then, for j ∈ N 0 := N ∪ {0}, we define the Littlewood-Paley projector P j as the Fourier multiplier operator with a symbol φ j . Thanks to (2.1), we have Next, we recall the following paraproduct decomposition due to Bony [6]. See [1,24] for further details. Let f and g be functions on T 3 of regularities s 1 and s 2 , respectively. Using (2.2), we write the product f g as 3) The first term f < g (and the third term f > g) is called the paraproduct of g by f (the paraproduct of f by g, respectively) and it is always well defined as a distribution of regularity min(s 2 , s 1 + s 2 ). On the other hand, the resonant product f = g is well defined in general only if s 1 + s 2 > 0. We briefly recall the basic properties of the Besov spaces B s p,q (T 3 ) defined by the norm: Lemma 2.1. (i) (paraproduct and resonant product estimates) Let s 1 , s 2 ∈ R and 1 ≤ p, p 1 , p 2 , q ≤ ∞ such that 1 The product estimates (2.4), (2.5), and (2.6) follow easily from the definition (2.3) of the paraproduct and the resonant product. See [1,39] for details of the proofs in the non-periodic case (which can be easily extended to the current periodic setting). The embedding (2.7) follows from the q -summability of 2 (s 1 −s 2 )j j∈N 0 for s 1 < s 2 and the uniform boundedness of the Littlewood-Paley projector P j .
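For the reader's convenience, the norms and the paraproduct decomposition referred to in this subsection can be sketched in their standard textbook forms. These are the usual definitions and are given here as an assumption about the omitted displays, not as a reproduction of (2.1)-(2.3); the symbols $\prec$, $\circ$, $\succ$ are placeholder notation for the three pieces written $f < g$, $f = g$, $f > g$ in the text above, and $\langle n\rangle = \sqrt{1+|n|^2}$.

```latex
% Sketch only: standard definitions on the torus T^3, with P_j the Littlewood-Paley projectors.
\begin{align*}
\|f\|_{H^s} &= \big\| \langle n\rangle^{s}\, \widehat{f}(n) \big\|_{\ell^2_n(\mathbb{Z}^3)},
&
\|f\|_{W^{s,p}} &= \big\| \mathcal{F}^{-1}\big[ \langle n\rangle^{s}\, \widehat{f}(n) \big] \big\|_{L^p(\mathbb{T}^3)}, \\
\|f\|_{B^{s}_{p,q}} &= \Big\| 2^{sj} \|P_j f\|_{L^p_x} \Big\|_{\ell^q_j(\mathbb{N}_0)} .
\end{align*}
% Bony's decomposition: low-high, comparable (resonant), and high-low frequency interactions
\begin{align*}
fg &= f \prec g + f \circ g + f \succ g \\
   &= \sum_{j < k-2} P_j f\, P_k g
    + \sum_{|j-k| \le 2} P_j f\, P_k g
    + \sum_{k < j-2} P_j f\, P_k g .
\end{align*}
```

With this convention, the first and third sums are the paraproducts, which are always defined, while the middle (resonant) sum is the piece that in general requires $s_1 + s_2 > 0$.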
We also recall the following product estimate from [25].
Note that while Lemma 2.2 was shown only for s = 3 1 p + 1 q − 1 r in [25], the general case 2.2. Fourier restriction norm method and Strichartz estimates. We first recall the so-called X s,b -spaces, also known as the hyperbolic Sobolev spaces, due to Klainerman-Machedon [34] and Bourgain [7], defined by the norm: ). Given an interval I ⊂ R, we define the local-in-time version X s,b (I) as a restriction norm: (2.9) Next, we recall the Strichartz estimates for the linear wave/Klein-Gordon equation. Given Then, we have the following Strichartz estimates.
When b > 1 2 , the X s,b -spaces enjoy the transference principle. In particular, as a corollary to Lemma 2.3, we obtain the following space-time estimate. See [35,64] for the proof.
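The $X^{s,b}$-norm and its local-in-time version recalled in this subsection have the following standard form for the wave/Klein-Gordon equation. This is a sketch of the usual definition, stated as an assumption about the omitted displays (2.8)-(2.9) rather than a verbatim reproduction; $\widehat{u}(n,\tau)$ denotes the space-time Fourier transform and $\langle \,\cdot\, \rangle = \sqrt{1+|\cdot|^2}$.

```latex
% Sketch only: standard hyperbolic Sobolev norm adapted to the Klein-Gordon symbol.
\begin{align*}
\|u\|_{X^{s,b}(\mathbb{T}^3\times\mathbb{R})}
  &= \big\| \langle n\rangle^{s}\, \langle |\tau| - \langle n\rangle \rangle^{b}\,
     \widehat{u}(n,\tau) \big\|_{\ell^2_n L^2_\tau(\mathbb{Z}^3\times\mathbb{R})}, \\
% restriction norm on a time interval I
\|u\|_{X^{s,b}(I)}
  &= \inf\big\{ \|v\|_{X^{s,b}(\mathbb{T}^3\times\mathbb{R})} : v|_{I} = u \big\}.
\end{align*}
% For b > 1/2 one has the embedding X^{s,b} \subset C(\mathbb{R}; H^s(\mathbb{T}^3)),
% which is what makes the transference of linear (Strichartz) estimates possible.
```

The weight $\langle |\tau| - \langle n\rangle\rangle^{b}$ measures the distance of the space-time frequency $(n,\tau)$ to the characteristic surface $|\tau| = \langle n\rangle$ of the linear equation, so that linear solutions localized in time have finite norm for every $b$.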
We also state the nonhomogeneous linear estimate. See [22].
We conclude this part by establishing the following trilinear estimate, which will be used to control the term v 2 in (1.25). See Proposition 8.6 in [12] for an analogous trilinear estimate.
Proof. By applying the Littlewood-Paley decompositions, we have LHS of (2.15)

For simplicity of notation, we set $N_1 = 2^{j_1}$, $N_{23} = 2^{j_{23}}$, and $N_{123} = 2^{j_{123}}$, denoting the dyadic frequency sizes of $n_1$ (for $u_1$), $n_{23}$ (for $u_2 u_3$), and $n_{123}$ (for $u_1 u_2 u_3$), respectively. We set $v_k = P_{j_k} u_k$. In view of $n_{123} = n_1 + n_{23}$, we separately estimate the contributions from (i) $N_{123} \sim \max(N_1, N_{23})$ and (ii) $N_{123} \ll \max(N_1, N_{23})$.

2.3. On discrete convolutions. Next, we recall the following basic lemma on a discrete convolution.
Then, we have (ii) Let d ≥ 1 and α, β ∈ R satisfy α + β > d. Then, we have Namely, in the resonant case (ii), we do not have the restriction α, β < d. Lemma 2.8 follows from elementary computations. See, for example, Lemmas 4.1 and 4.2 in [41] for the proof.
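The two discrete convolution bounds discussed here are usually stated in the following form. The block is a sketch of the standard statement (as found, for instance, in the reference cited for the proof), given under the assumption that it matches the omitted displays; the concrete instance at the end is added only as an illustration.

```latex
% Sketch only: standard discrete convolution estimates on Z^d.
% (i) non-resonant case: \alpha + \beta > d and \alpha, \beta < d
\begin{align*}
\sum_{n = n_1 + n_2} \frac{1}{\langle n_1\rangle^{\alpha} \langle n_2\rangle^{\beta}}
  \lesssim \langle n\rangle^{d - \alpha - \beta}
  \qquad \text{for any } n \in \mathbb{Z}^d .
\end{align*}
% (ii) resonant case: only \alpha + \beta > d is needed
\begin{align*}
\sum_{\substack{n = n_1 + n_2 \\ |n_1| \sim |n_2|}}
  \frac{1}{\langle n_1\rangle^{\alpha} \langle n_2\rangle^{\beta}}
  \lesssim \langle n\rangle^{d - \alpha - \beta}
  \qquad \text{for any } n \in \mathbb{Z}^d .
\end{align*}
% Illustration with d = 3, \alpha = \beta = 2 (so \alpha + \beta = 4 > 3 and \alpha, \beta < 3):
%   \sum_{n = n_1 + n_2} \langle n_1\rangle^{-2} \langle n_2\rangle^{-2} \lesssim \langle n\rangle^{-1}.
```

The upper restriction $\alpha, \beta < d$ in case (i) excludes the situation where one factor is summable on its own, in which case the sum is dominated by the region where that factor's frequency is small and the stated decay rate can fail.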
2.4. Tools from stochastic analysis. We conclude this section by recalling useful lemmas from stochastic analysis. See [5,62,43] for basic definitions. See also Appendix B for basic definitions and properties for multiple stochastic integrals.
Let $(H, B, \mu)$ be an abstract Wiener space. Namely, $\mu$ is a Gaussian measure on a separable Banach space $B$ with $H \subset B$ as its Cameron-Martin space. Given a complete orthonormal system $\{e_j\}_{j \in \mathbb{N}} \subset B^*$ of $H^* = H$, we define a polynomial chaos of order $k$ to be an element of the form $\prod_{j=1}^\infty H_{k_j}(\langle x, e_j \rangle)$, where $x \in B$, $k_j \neq 0$ for only finitely many $j$'s, $k = \sum_{j=1}^\infty k_j$, $H_{k_j}$ is the Hermite polynomial of degree $k_j$, and $\langle \cdot, \cdot \rangle = {}_B\langle \cdot, \cdot \rangle_{B^*}$ denotes the $B$-$B^*$ duality pairing. We then denote the closure of polynomial chaoses of order $k$ under $L^2(B, \mu)$ by $\mathcal{H}_k$. The elements in $\mathcal{H}_k$ are called homogeneous Wiener chaoses of order $k$. We also set

$$\mathcal{H}_{\le k} = \bigoplus_{j=0}^{k} \mathcal{H}_j$$

for $k \in \mathbb{N}$.

Let $L = \Delta - x \cdot \nabla$ be the Ornstein-Uhlenbeck operator.$^{15}$ Then, it is known that any element in $\mathcal{H}_k$ is an eigenfunction of $L$ with eigenvalue $-k$. Then, as a consequence of the hypercontractivity of the Ornstein-Uhlenbeck semigroup $U(t) = e^{tL}$ due to Nelson [42], we have the following Wiener chaos estimate [63, Theorem I.22]. See also [65, Proposition 2.4].

Lemma 2.9. Let $k \in \mathbb{N}$. Then, we have

$$\| X \|_{L^p(\Omega)} \le (p - 1)^{\frac{k}{2}} \| X \|_{L^2(\Omega)}$$

for any $p \ge 2$ and any $X \in \mathcal{H}_{\le k}$.
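A typical way the Wiener chaos estimate is used is to convert second-moment bounds into exponential tail bounds. The following worked derivation is a sketch under the assumption that the estimate takes its standard form $\|X\|_{L^p(\Omega)} \le (p-1)^{k/2}\|X\|_{L^2(\Omega)}$ for $p \ge 2$ and $X$ in the chaoses of order at most $k$; the constants are not optimized and the notation $\sigma$ is introduced here.

```latex
% Sketch only: from the Wiener chaos estimate to an exponential tail bound.
\begin{align*}
% Chebyshev's inequality followed by the Wiener chaos estimate
P\big(|X| > \lambda\big)
  &\le \lambda^{-p}\, \mathbb{E}|X|^p
   \le \bigg( \frac{(p-1)^{k/2}\, \sigma}{\lambda} \bigg)^{p},
   \qquad \sigma := \|X\|_{L^2(\Omega)} . \\
% choose p - 1 = e^{-1} (\lambda/\sigma)^{2/k}; this gives p \ge 2 when \lambda \ge e^{k/2}\sigma,
% and then (p-1)^{k/2}\sigma/\lambda = e^{-k/2}
P\big(|X| > \lambda\big)
  &\le e^{-kp/2}
   \le \exp\bigg( - \frac{k}{2e} \Big( \frac{\lambda}{\sigma} \Big)^{2/k} \bigg),
   \qquad \lambda \ge e^{k/2} \sigma .
\end{align*}
```

For $k = 1$ this is a Gaussian tail, and for higher $k$ the tail is of stretched-exponential type with exponent $2/k$, which is the kind of bound behind the exponential tail estimates mentioned for the stochastic objects.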
The following lemma will be used in studying regularities of stochastic objects. We say that a stochastic process X : R + → D (T d ) is spatially homogeneous if {X(·, t)} t∈R + and {X(x 0 + · , t)} t∈R + have the same law for any x 0 ∈ T d . Given h ∈ R, we define the difference operator δ h by setting Lemma 2.10. Let {X N } N ∈N and X be spatially homogeneous stochastic processes : , s < s 0 − σ 2 , almost surely, thus converging to some process in C([0, T ]; W s,∞ (T d )). Lemma 2.10 follows from a straightforward application of the Wiener chaos estimate (Lemma 2.9). For the proof, see Proposition 3.6 in [41] and Appendix in [50]. As compared to Proposition 3.6 in [41], we made small adjustments. In studying the time regularity, we made the following modifications: n −d−2s 0 +2σ → n −d−2s 0 +σ and s < s 0 − σ → s < s 0 − σ 2 so that it is suitable for studying the wave equation. Moreover, while the result in [41] is stated in terms of the Besov-Hölder space for any ε > 0. For the proof of the almost sure convergence claims, see [50].
3. Local well-posedness of SNLW, $\alpha > 0$

In this section, we present the proof of local well-posedness of (1.25) (Theorem 1.1). In Subsection 3.1, we first state the regularity and convergence properties of the stochastic objects in the enhanced data set $\Xi$ in (1.26). In Subsection 3.2, we then present a deterministic local well-posedness result by viewing elements in the enhanced data set as given (deterministic) distributions and a given (deterministic) operator with prescribed regularity properties.
3.1. On the stochastic terms. In this subsection, we state the regularity and convergence properties of the stochastic objects in (1.26) whose proofs are presented in Section 4.
, almost surely. In particular, denoting the limit by (formally given by (1.11)), we have for any ε > 0, almost surely.
) almost surely. In particular, denoting the limit by , we have for any ε > 0, almost surely.
Remark 3.2. (i) As mentioned in Section 1, a parabolic thinking gives regularity 3α − 1 2 − for . Lemma 3.1 (ii) states that, when α > 0 is small, we indeed gain about 1 2 -regularity by exploiting multilinear dispersion as in the quadratic case studied in [26]. We point out that our proof is based on an adaptation of Bringmann's analysis on the corresponding term in the Hartree case [12] and thus the regularities we obtain in Lemma 3.1 (ii) as well as Lemmas 3.3, 3.4, and 3.5 may not be sharp (especially for large α > 0; see, for example, a crude bound (4.9)). They are, however, sufficient for our purpose.
(ii) In this section, we only state almost sure convergence but the same argument also yields convergence in L p (Ω) with an exponential tail estimate (as in [27,48,12]). Our goal is, however, to prove local well-posedness and thus the almost sure convergence suffices for our purpose.
) almost surely. In particular, denoting the limit by , we have for any ε > 0, almost surely.
. In particular, denoting the limit by , we have for any ε > 0, almost surely.
(ii) For any s < α . In particular, denoting the limit by , we have for any ε > 0, almost surely.
Given Banach spaces B 1 and B 2 , we use L(B 1 ; B 2 ) to denote the space of bounded linear operators from B 1 to B 2 . We also set endowed with the norm given by for some small θ > 0.
Lemma 3.5. Let α > 0 and T 0 > 0. Then, given sufficiently small δ 1 , δ 2 > 0, the sequence of the random operators {I N } N ∈N defined in (1.21) is a Cauchy sequence in the class , almost surely. In particular, denoting the limit by I , we have The following trilinear estimate is an immediate consequence of Lemma 2.7.

3.2. Proof of Theorem 1.1. In this subsection, we prove the following proposition. Theorem 1.1 then follows from this proposition and Lemmas 3.1-3.5.
• is a function belonging to X α+ 1 is a function belonging to X α+ 1 • the operator I belongs to the class L in the class Proof. Given α > 0 and s > 1 2 , fix small ε > 0 such that ε < min(α, s − 1 2 ). Given an enhanced data set Ξ as in (3.3), we set is as in (3.2). In the following, we assume that for some K ≥ 1. Given the enhanced data set Ξ in (3.3), define a map Γ Ξ by and for some θ > 0. Similarly, we have (3.7) From Lemma 2.5 and Lemma 2.2 with (3.4), we have (3. An analogous computation yields a difference estimate on Γ Ξ (v 1 ) − Γ Ξ (v 2 ). Therefore, Proposition 3.7 follows from a standard contraction argument.

4. Regularities of the stochastic terms
In this section, we present the proofs of Lemmas 3.1-3.5, which are the basic tools in applying Proposition 3.7 to prove Theorem 1.1. In view of the local well-posedness result in [51], we assume that $0 < \alpha \le \frac14$ in the following. Without loss of generality, we assume that T ≤ 1. The main tools in this section are the counting estimates from [12, Section 4] and the random matrix estimate (see Lemma C.3 below) from [18], which capture the multilinear dispersive effect of the wave equation. For readers' convenience, we collect the relevant counting estimates in Appendix A and the relevant definitions and estimates for random matrices and tensors in Appendix C. We show in detail how to reduce the relevant stochastic estimates to some basic counting and (random) matrix/tensor estimates studied in [12, Section 4] and [18].
In the remaining part of this section, we assume 0 < T < T 0 ≤ 1.

4.1. Basic stochastic terms. We first present the proof of Lemma 3.1.
for any n ∈ Z 3 and N ≥ 1. Also, by the mean value theorem and an interpolation argument as in [26], we have Hence, from Lemma 2.10, we conclude that N ∈ C([0, T ]; W α− 1 2 −ε,∞ (T 3 )) for any ε > 0, almost surely. Moreover, a slight modification of the argument, using Lemma 2.10, yields that { N } N ∈N is almost surely a Cauchy sequence in C([0, T ]; W α− 1 2 −ε,∞ (T 3 )), thus converging to some limit . Since the required modification is exactly the same as in [26], we omit the details here.
Remark 4.1. In the remaining part of this section, we establish uniform (in N ) regularity bounds on the truncated stochastic terms (such as N ) but may omit the convergence part of the argument. Furthermore, as for N N studied in Lemma 3.3, we only establish a uniform (in N ) regularity bound on N N (t) for each fixed 0 < t ≤ T ≤ 1. A slight modification as above yields continuity in time but we omit details.
See Appendix B for the basic definitions and properties of multiple stochastic integrals. In terms of multiple stochastic integrals, we can express (4.3) as where f n,t is defined by f n,t (n 1 , t 1 , n 2 , t 2 , n 3 , for (n 1 , t 1 , n 2 , t 2 , n 3 , t 3 ) ∈ (Z 3 × R) 3 . Then, by Fubini's theorem for multiple stochastic integrals (Lemma B.2), we have where F t denotes the Fourier transform in time. With this notation, it follows from Lemma B.1 that we can write the second moment of the X s,b -norm of A N N 1 ,N 2 ,N 3 , appearing in (4.8) and (4.11), in a concise manner: where fN n,t is given by In the following, for conciseness of the presentation, we express various stochastic objects as multiple stochastic integrals on (Z 3 × R + ) k and carry out analysis. For this purpose, we set and use the following short-hand notation: (4.16) Note, however, that one may also carry out equivalent analysis at the level of multiple Wiener-Ito integrals as in the proof of Lemma 3.1 presented above.
Next, we briefly discuss the proof of Lemma 3.3.

Proof of Lemma 3.3. By the paraproduct decomposition (2.3), we have
In view of Lemma 2.1 with (2.19), the paraproducts N < N and N > N belong to C([0, T ]; W α− 1 2 −ε,∞ (T 3 )) for any ε > 0, almost surely. Hence, it remains to study the resonant product = N := N = N . We only study the regularity of the resonant product for a fixed time since the continuity in time and the convergence follow from a systematic modification. In the following, we show for any n ∈ Z 3 and N ≥ 1. Note the bound (4.17) together with Lemma 2.10 shows that the resonant product = N is smoother and has (spatial) regularity 2α − 1 2 − = (α−) + α − 1 2 − . As in [41], by decomposing = N (n, t) into components in the homogeneous Wiener chaoses H k , k = 2, 4, we have From a slight modification 17 of (4.8) with Lemma A.2, we have for any n ∈ Z 3 and N ≥ 1. Then, from Jensen's inequality (see (B.2)), 18 (4.1), (4.18), and Lemma 2.8, we have where g n,t,t is defined by g n,t,t (z 2 , z 3 ) = (4.20) Note that g n,t,t (z 2 , z 3 ) is symmetric (in z 2 and z 3 ). From Fubini's theorem (Lemma B.2), we have We now apply Lemma B.1 to compute the second moment of(4.21). Then, with κ(n) as in (4.7), it follows from expanding the sine functions in (4.20) in terms of the complex exponentials and switching the order of integration in t and t 1 that Under the condition |n 1 | ∼ |n 123 | and n = n 2 + n 3 , we have |n 1 | |n|. Then, by applying the basic resonant estimate (Lemma A.3) and Lemma 2.8, we obtain N is even smoother and has (spatial) regularity 4α−. Therefore, putting (4.19) and (4.22) together, we obtain the desired bound (4.17).

Quintic stochastic term.
In this subsection, we present the proof of Lemma 3.4 (i) on the quintic stochastic process N defined in (1.22). In view of Lemma 2.5, we prove the following bound; given any ε > 0 and sufficiently small δ 2 > 0, there exists θ > 0 such that for any p ≥ 1 and 0 < T ≤ 1, uniformly in N ∈ N. We start by computing the space-time Fourier transform of N with a time cutoff. As shown in (1.22), the quintic stochastic objects N is a convolution of N in (1.15) and N in (1.14): N (n, t) = n=n 123 +n 45 N (n 123 , t) N (n 45 , t). (4.24) Using Lemma B.2, we can write N and N as multiple stochastic integrals: (4.25) where f n,t,t and g n,t are defined by (4.26) By the product formula (Lemma B.4) to (4.24), we can decompose N into the components in the homogeneous Wiener chaoses H k , k = 1, 3, 5: N ∈ H 1 . By taking the Fourier transforms in time, the relation (4.27) still holds. Then, by using the orthogonality of H 5 , H 3 , and H 1 , we have Hence, it suffices to prove (4.23) for each    where f (5) n,t is defined by (4.28) Let Sym(f (5) n,t ) be the symmetrization of f   Then, by taking the temporal Fourier transform and applying Fubini's theorem (Lemma B.2), we have n,· )(τ )) .
With the symmetrization Sym(f    19 Note that both f n,t,t and gn,t in (4.26) are symmetric in their arguments.
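The appearance of exactly the chaoses of order 1, 3, and 5 in the product of a cubic and a quadratic object can be illustrated by the one-dimensional analogue of the product formula, namely the linearization formula for the probabilists' Hermite polynomials, $He_k He_\ell = \sum_{r} r! \binom{k}{r}\binom{\ell}{r} He_{k+\ell-2r}$. The following is only a scalar sketch under that analogy, not the computation carried out in this subsection; the function names (`hermite`, `product_formula`, `expand`) are illustrative and do not come from the paper.

```python
from math import comb, factorial


def hermite(n):
    """Coefficients (lowest degree first) of the probabilists' Hermite polynomial He_n."""
    prev, cur = [1], [0, 1]  # He_0 = 1, He_1 = x
    if n == 0:
        return prev
    for m in range(1, n):
        # Recurrence: He_{m+1}(x) = x * He_m(x) - m * He_{m-1}(x)
        nxt = [0] + cur
        for i, c in enumerate(prev):
            nxt[i] -= m * c
        prev, cur = cur, nxt
    return cur


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def product_formula(k, l):
    """Weights {order: r! C(k,r) C(l,r)} of He_{k+l-2r} in the product He_k * He_l."""
    return {k + l - 2 * r: factorial(r) * comb(k, r) * comb(l, r)
            for r in range(min(k, l) + 1)}


def expand(weights):
    """Coefficient list of sum_order weight * He_order."""
    out = [0] * (max(weights) + 1)
    for order, w in weights.items():
        for i, c in enumerate(hermite(order)):
            out[i] += w * c
    return out


# Scalar analogue of (cubic object) x (quadratic object): orders 5, 3, and 1 appear.
chaos = product_formula(3, 2)
lhs = poly_mul(hermite(3), hermite(2))
rhs = expand(chaos)
```

Here `r` counts the number of contractions: `r = 0` gives the fifth order part, while `r = 1` and `r = 2` give the third and first order parts, mirroring the decomposition into $H_5$, $H_3$, and $H_1$ described above.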
• Case (ii): General septic terms. As we saw in the previous subsections, all other terms in (4.48) come from the contractions of the product of N · N · N . In order to fully describe these terms, we recall the notion of a pairing from [12,Definition 4.30] to describe the structure of the contractions. (iii) P is univalent, i.e. for each 1 ≤ i ≤ J, (i, j) ∈ P for at most one 1 ≤ j ≤ J.
If $(i, j) \in P$, the tuple $(i, j)$ is called a pair. If $1 \le j \le J$ is contained in a pair, we say that $j$ is paired. With a slight abuse of notation, we also write $j \in P$ if $j$ is paired. If $j$ is not paired, we say that $j$ is unpaired and write $j \notin P$. Furthermore, given a partition $\mathcal{A} = \{A_\ell\}_{\ell=1}^{L}$ of $\{1, \dots, J\}$, we say that $P$ respects $\mathcal{A}$ if $i, j \in A_\ell$ for some $1 \le \ell \le L$ implies that $(i, j) \notin P$. Namely, $P$ does not pair elements of the same set $A_\ell \in \mathcal{A}$. We say that $(n_1, \dots, n_J) \in (\mathbb{Z}^3)^J$ is admissible if $(i, j) \in P$ implies that $n_i + n_j = 0$.
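The combinatorial notions above (pairing, paired and unpaired indices, respecting a partition, admissibility) can be made concrete with a short sketch. Only the univalence condition (iii) is visible in the text above; the anti-reflexivity and symmetry checks below are assumptions about the remaining conditions of the definition, and the partition of $\{1, \dots, 7\}$ into blocks of sizes 3, 2, 2 as well as the frequencies are hypothetical choices made purely for illustration.

```python
def is_pairing(P, J):
    """Check that P, a set of ordered tuples (i, j) on {1, ..., J}, is a pairing."""
    in_range = all(1 <= i <= J and 1 <= j <= J for (i, j) in P)
    anti_reflexive = all(i != j for (i, j) in P)      # assumed condition (i)
    symmetric = all((j, i) in P for (i, j) in P)      # assumed condition (ii)
    univalent = all(                                  # condition (iii) from the text
        sum(1 for (a, _) in P if a == i) <= 1 for i in range(1, J + 1)
    )
    return in_range and anti_reflexive and symmetric and univalent


def unpaired(P, J):
    """Indices j in {1, ..., J} that are not contained in any pair."""
    paired = {i for (i, _) in P}
    return [j for j in range(1, J + 1) if j not in paired]


def respects(P, partition):
    """P respects the partition if it never pairs two elements of the same block."""
    return all((i, j) not in P for block in partition for i in block for j in block)


def is_admissible(freqs, P):
    """freqs[j - 1] is n_j in Z^3; admissible means n_i + n_j = 0 for every pair."""
    return all(
        all(a + b == 0 for a, b in zip(freqs[i - 1], freqs[j - 1]))
        for (i, j) in P
    )


# Hypothetical example with J = 7 and blocks of sizes 3, 2, 2.
partition = [{1, 2, 3}, {4, 5}, {6, 7}]
P = {(1, 4), (4, 1), (5, 6), (6, 5)}
freqs = [(1, 0, 2), (0, 1, 0), (2, 2, 2), (-1, 0, -2), (0, 3, 0), (0, -3, 0), (1, 1, 1)]
```

In this example two pairs are contracted, so three indices remain unpaired; in the setting of multiple stochastic integrals, the number of unpaired indices is the order of the resulting Wiener chaos.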
In order to represent (k) N (n, t), k = 1, 3, 5, as multiple stochastic integrals as in (4.49), we start with (4.50) and perform a contraction over the variables z j = (n j , t j ), namely, we consider a non-trivial pairing (that is, P ≠ ∅) on {1, . . . , 7}. Then, by integrating in t and t first in (4.50) after a contraction, a computation analogous to that in Case (i) yields (k) where K is as in (4.52) and the non-resonant frequency n nr is defined by N discussed in Case (i) is a special case of (4.54) with P = ∅. By applying Lemma A.6 (with (1.23)), we then obtain provided that ε > 0. This concludes the proof of Lemma 3.4 (ii).

Random operator.
In this subsection, we present the proof of Lemma 3.5 on the random operator I N defined in (1.21).
Then, by noting | v(n, τ ) (4.60) With this in mind, we write where ε 0 , ε 3 ∈ {−1, 1} and the kernel H = H ε 0 ,ε 3 is given by By Fubini's theorem (Lemma B.2), we can write H as where h n,n 3 ,τ,τ 3 is given by (4.63) Then, by (4.6), (4.61), Cauchy-Schwarz's inequality, and (4.60), we have , as long as δ, δ 2 > 0, where, in the last step, we used Minkowski's integral inequality followed by Hölder's inequality (in τ and τ 3 ). Here, we viewed H(n, n 3 , τ, τ 3 ) (for fixed τ, τ 3 ∈ R) as an infinite dimensional matrix operator mapping from 2 n 3 into 2 n . Hence, the estimate (4.59) is reduced to proving As mentioned above, we instead establish a frequency-localized version of (4.64): for some small δ 0 > 0, uniformly in dyadic N 1 , N 2 , N 3 ≥ 1, where N max = max(N 1 , N 2 , N 3 ) and H N 1 ,N 2 ,N 3 is defined by (4.62) and (4.63) with extra frequency localizations 1 |n j |∼N j , j = 1, 2, 3. Namely, we have where h N 1 ,N 2 ,N 3 n,n 3 ,τ,τ 3 is given by where H m n 3 ,τ,τ 3 is given by Performing t-integration, we have  for any ε > 0, provided that δ 1 < α, which is needed to apply Lemma C.2. Hence, by noting that the condition |κ(n) − m| ≤ 1 implies |m| N max and summing over m ∈ Z, the bound (4.65) follows from (4.69) and (4.71) (by taking ε > 0 sufficiently small), which in turn implies Namely, the frequencies n 1 , n 2 , and n 3 are localized to the dyadic blocks {|n j | ∼ N j }, j = 1, 2, 3. On the other hand, a crude bound shows for some (possibly large) K > 0. By interpolating (4.72) and (4.73) and then summing over dyadic N j , j = 1, . . . , 3, we obtain (4.58) for some small δ 2 > 0. Lastly, as for the convergence of I N to I , we can simply repeat the computation above to estimate the difference In considering the difference of the tensors h m in (4.68), we then obtain a new restriction max(|n 1 |, |n 2 |) N , which allows us to gain a small negative power of N . 
As a result, we obtain for some small ε, δ 0 > 0, Then, interpolating this with (4.73) and summing over dyadic blocks, we then obtain for any p ≥ 1 and M ≥ N ≥ 1. Then, by applying Chebyshev's inequality, summing over N ∈ N, and applying the Borel-Cantelli lemma, we conclude the almost sure convergence of I N . This concludes the proof of Lemma 3.5.

Appendix A. Counting estimates
In this section, we state the counting estimates used in Section 4 to study the regularities of the stochastic terms. These lemmas are taken from Bringmann [12]. Note that some statements are given in a slightly simplified form. The same comment applies to Lemma C.2.
Next, we recall the basic resonant estimate.
The next two lemmas (and Lemma A.3 above) are used for estimating the quintic stochastic term.
Lemma A.4 is essentially Lemma 4.27 in [12], where the condition |κ 4 (n) − m | ≤ 1 in (A.1) is replaced by |κ 4 (n) + ε 123 n 123 − m | ≤ 1. We point out that this modification does not make any difference in the proof. In our notation, the first step of the proof of Lemma 4.27 in [12] is to sum over n 5 , using [12,Lemma 4.17], for which the conditions |κ 4 (n) − m | ≤ 1 in (A.1) and |κ 4 (n) + ε 123 n 123 − m | ≤ 1 do not make any difference since the extra term ε 123 n 123 is fixed in summing over n 5 .

Appendix B. Multiple stochastic integrals
In this section, we go over the basic definitions and properties of multiple stochastic integrals. See [43] and also [12,Section 4] for further discussion.
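Let $\lambda$ be the measure on $Z := \mathbb{Z}^3 \times \mathbb{R}_+$ defined by $d\lambda = dn\,dt$, where $dn$ is the counting measure on $\mathbb{Z}^3$. Given $k \in \mathbb{N}$, we set $\lambda_k = \bigotimes_{j=1}^{k} \lambda$ and $L^2(Z^k) = L^2((\mathbb{Z}^3 \times \mathbb{R}_+)^k, \lambda_k)$. Given a function $f \in L^2(Z^k)$, we can adapt the discussion in [
Given a function $f \in L^2(Z^k)$, we define its symmetrization $\mathrm{Sym}(f)$ by
$$\mathrm{Sym}(f)(z_1, \dots, z_k) = \frac{1}{k!} \sum_{\sigma \in S_k} f(z_{\sigma(1)}, \dots, z_{\sigma(k)}),$$
where $z_j = (n_j, t_j)$ as in (4.15) and $S_k$ denotes the symmetric group on $\{1, \dots, k\}$. Note that by Jensen's inequality, we have
$$\|\mathrm{Sym}(f)\|_{L^p(Z^k)} \le \|f\|_{L^p(Z^k)} \tag{B.2}$$
for any $p \ge 1$. We say that $f$ is symmetric if $\mathrm{Sym}(f) = f$. We now recall some basic properties of multiple stochastic integrals.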
Lemma B.1. Let $k, \ell \in \mathbb{N}$. The following statements hold for any $f \in L^2(Z^k)$ and $g \in L^2(Z^\ell)$: (i) $I_k : L^2(Z^k) \to \mathcal{H}_k \subset L^2(\Omega)$ is a linear operator, where $\mathcal{H}_k$ denotes the $k$th Wiener chaos.
(iii) Ito isometry: (iv) Furthermore, suppose that f is symmetric. Then, we have where the iterated integral on the right-hand side is understood as an iterated Ito integral.
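The symmetrization operation used throughout this appendix, i.e. averaging a function of $k$ variables over all permutations of its arguments, can be sketched on a finite toy model. The sketch below assumes integer-valued variables $z_j = (n_j, t_j)$ and uses exact rational arithmetic; the helper `sym` and the test function `f` are illustrative names and are not part of the paper.

```python
from fractions import Fraction
from itertools import permutations


def sym(f, k):
    """Return the symmetrization of f: the average of f over all permutations of its k arguments."""
    perms = list(permutations(range(k)))

    def sym_f(*z):
        total = sum(Fraction(f(*(z[s] for s in sigma))) for sigma in perms)
        return total / len(perms)

    return sym_f


def f(z1, z2):
    """A toy non-symmetric function of z_j = (n_j, t_j)."""
    return z1[0] * z2[1]


g = sym(f, 2)

# A finite, permutation-invariant grid standing in for the measure space Z^2.
grid = [(n, t) for n in range(-2, 3) for t in range(3)]
norm_f = sum(Fraction(f(z1, z2)) ** 2 for z1 in grid for z2 in grid)
norm_g = sum(g(z1, z2) ** 2 for z1 in grid for z2 in grid)
```

On this grid, `norm_g <= norm_f` is a discrete instance (with $p = 2$) of the fact that symmetrization does not increase $L^p$-norms, and applying `sym` twice returns the same function as applying it once.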
We state a version of Fubini's theorem for multiple stochastic integrals that is convenient for our purpose. See, for example, [15,Theorem 4.33] for a version of the stochastic Fubini theorem.
Proof. From Lemma B.1 (ii), we may assume that $f(z_1, \dots, z_k, t)$ is symmetric in $z_j = (n_j, t_j)$, $j = 1, \dots, k$. Let $\mathbf{n} = (n_1, \dots, n_k)$ and $\mathbf{t} = (t_1, \dots, t_k)$. From Minkowski's integral inequality, Lemma B.1 (iii), and Cauchy-Schwarz's inequality, we have

On the other hand, by Lemma B.1 (iii) and Cauchy-Schwarz's inequality, we have

Hence, it follows from (B.4), (B.5), and the density$^{21}$ of $\ell^2_{\mathbf{n}}((\mathbb{Z}^3)^k; C^\infty_{t,\mathbf{t}}([0, T]^{k+1}))$ in $\ell^2_{\mathbf{n}}((\mathbb{Z}^3)^k; L^2_{t,\mathbf{t}}([0, T]^{k+1}))$ that we may assume that $f$ is symmetric and belongs to $\ell^2_{\mathbf{n}}((\mathbb{Z}^3)^k; C^\infty_{t,\mathbf{t}}([0, T]^{k+1}))$. Furthermore, we may assume that $f$ has a compact support in $\mathbf{n}$. Namely, there exists $K > 0$ such that if $\max(|n_1|, \dots, |n_k|) > K$, then $f(n_1, t_1, \dots, n_k, t_k, t) = 0$ for any $t_1, \dots, t_k, t \in [0, T]$. Then, together with Lemma B.1 (iv), we have

since the summation is over a finite set of indices $\mathbf{n} = (n_1, \dots, n_k)$ and $f$ is symmetric. Hence, it remains to justify interchanging the $t$-integration with the stochastic integrals for each fixed $\mathbf{n} = (n_1, \dots, n_k)$. For this reason, we suppress the dependence of $f$ on $\mathbf{n}$ in the following.

$^{21}$ By identifying a function $f \in \ell^2_{\mathbf{n}}((\mathbb{Z}^3)^k; L^2_{t,\mathbf{t}}([0, T]^{k+1}))$ with a sequence $\{f_{\mathbf{n}}\}_{\mathbf{n} \in (\mathbb{Z}^3)^k} \subset L^2_{t,\mathbf{t}}([0, T]^{k+1})$, we can approximate each $f_{\mathbf{n}}$ by a smooth function $\varphi_{\mathbf{n}}$ such that $\|f_{\mathbf{n}} - \varphi_{\mathbf{n}}\|_{L^2_{t,\mathbf{t}}([0,T]^{k+1})} < \varepsilon_{\mathbf{n}}$, where $\varepsilon_{\mathbf{n}}$ is symmetric in $\mathbf{n}$ and $\sum_{\mathbf{n} \in (\mathbb{Z}^3)^k} \varepsilon_{\mathbf{n}} = \varepsilon$. Then, the function $\varphi \cong \{\varphi_{\mathbf{n}}\}_{\mathbf{n} \in (\mathbb{Z}^3)^k}$ approximates $f$ within distance $\varepsilon$ in $\ell^2_{\mathbf{n}}((\mathbb{Z}^3)^k; L^2_{t,\mathbf{t}}([0, T]^{k+1}))$. Since $f$ is symmetric, we can choose $\varphi$ to be symmetric.
When k = 1, we can exploit the smoothness of f and have where, at the second equality, we used the standard Fubini's theorem in view of the almost sure boundedness of B n 1 on [0, T ]. This proves (B.3) when k = 1.
For the general case, let us first consider the innermost integral in (B.6). For notational simplicity, let us suppress all the variables of f except for t k and t. Let ∆ m = {0 ≤ τ 0 < τ 1 < · · · < τ m ≤ T } be a partition of [0, T ] and define a step function f m (·, t) by setting it follows from the definition of the Wiener integral that By the definition of the Wiener integral once again, we have RHS of (B.9) −→ while from Minkowski's integral inequality, (B.8), and the bounded convergence theorem (recall that f is smooth), we have as m → ∞. Hence, from (B.9), (B.10), and (B.11), we conclude that Next, we consider Given the partition ∆ m of [0, T ] as above, we define an adaptive step function F m (·, t) by setting F m (τ, t; ω) = F (τ j−1 , t; ω) for τ j−1 < τ ≤ τ j . Then, we can simply repeat the previous computation (but with Ito integrals instead of Wiener integrals) and obtain in L 2 (Ω). Combining (B.13) and (B.14) with (B.12), we then obtain in L 2 (Ω). By iterating this process, we conclude in L 2 (Ω). Together with (B.6), this proves (B.3).
Note that even if f and g are symmetric, their contraction f ⊗ r g is not symmetric in general. We now state the product formula. See [43,Proposition 1.1.3].
Lemma B.4 (product formula). Let $k, \ell \in \mathbb{N}$. Let $f \in L^2(Z^k)$ and $g \in L^2(Z^\ell)$ be symmetric functions. Then, we have
$$I_k[f] \cdot I_\ell[g] = \sum_{r=0}^{\min(k, \ell)} r! \binom{k}{r} \binom{\ell}{r} I_{k+\ell-2r}[f \otimes_r g].$$
Appendix C. Random tensors
In this section, we provide the basic definition and some lemmas on (random) tensors from [18,12]. See [18, Sections 2 and 4] and [12, Section 4] for further discussion.
Definition C.1. Let $A$ be a finite index set. We denote by $n_A$ the tuple $(n_j : j \in A)$. A tensor $h = h_{n_A}$ is a function $(\mathbb{Z}^3)^A \to \mathbb{C}$ with the input variables $n_A$. Note that the tensor $h$ may also depend on $\omega \in \Omega$. The support of a tensor $h$ is the set of $n_A$ such that $h_{n_A} \neq 0$.
Given a finite index set $A$, let $(B, C)$ be a partition of $A$. We define the norms $\|\cdot\|_{n_A}$ and $\|\cdot\|_{n_B \to n_C}$ by

where we used the short-hand notation $\sum_{n_Z}$ for $\sum_{n_Z \in (\mathbb{Z}^3)^Z}$ for a finite index set $Z$. Note that, by duality, we have $\|h\|_{n_B \to n_C} = \|h\|_{n_C \to n_B} = \|\overline{h}\|_{n_B \to n_C}$ for any tensor $h = h_{n_A}$. If $B = \emptyset$ or $C = \emptyset$, then we have $\|h\|_{n_B \to n_C} = \|h\|_{n_A}$.
For example, when $A = \{1, 2\}$, the norm $\|h\|_{n_1 \to n_2}$ denotes the usual operator norm $\|h\|_{\ell^2_{n_1} \to \ell^2_{n_2}}$ for an infinite dimensional matrix operator $\{h_{n_1 n_2}\}_{n_1, n_2 \in \mathbb{Z}^3}$. By bounding the matrix operator norm by the Hilbert-Schmidt norm (= the Frobenius norm), we have
$$\|h\|_{n_1 \to n_2} \le \|h\|_{n_1 n_2}.$$
Next, we recall a key deterministic tensor bound in the study of the random cubic NLW from [12].
We conclude this section with the following random matrix estimate. This lemma is essentially Propositions 2.8 and 4.14 in [18]; see also Proposition 4.50 in [12]. In our stochastic PDE setting, however, we need a slightly different formulation (in particular, adapted to multiple stochastic integrals with general integrands) and thus for readers' convenience, we present its proof.
Let $A$ be a finite index set. As in (4.15) and (4.16), we set $z_A = (n_A, t_A)$ for $(n_A, t_A) \in (\mathbb{Z}^3)^A \times \mathbb{R}^A$ and write $f_{z_A} = f(z_A) = f(n_A, t_A)$.
Lemma C.3. Let $A$ be a finite index set with $k = |A| \ge 1$. Let $h = h_{bcn_A}$ be a tensor such that $n_j \in \mathbb{Z}^3$ for each $j \in A$ and $(b, c) \in (\mathbb{Z}^3)^d$ for some integer $d \ge 2$. Given $N \ge 1$, assume that
$$\operatorname{supp} h \subset \big\{ |b|, |c|, |n_j| \lesssim N \text{ for each } j \in A \big\}. \tag{C.4}$$
Given a (deterministic) tensor $h_{bcn_A} \in \ell^2_{bcn_A}$, define the tensor $H = H_{bc}$ by
$$H_{bc} = I_k\big[h_{bcn_A} f_{z_A}\big], \tag{C.5}$$
where $I_k$ denotes the multiple stochastic integral defined in Appendix B. Then, for any $\theta > 0$, we have

where the maximum is taken over all partitions $(B, C)$ of $A$.
Remark C.4. (i) The assumption that $h_{bcn_A} \in \ell^2_{bcn_A}$ and $f \in \ell^\infty_{n_A}((\mathbb{Z}^3)^A; L^2_{t_A}(\mathbb{R}^A_+))$ ensures that the multiple stochastic integral $I_k[h_{bcn_A} f_{z_A}]$ in (C.5) is well defined. Note that if, for instance, we have a stronger condition $f \in \ell^2((\mathbb{Z}^3)^A; L^2(\mathbb{R}^A_+))$, then the conclusion (C.6) trivially holds without any loss in $N$. We also note that even if the tensor $h$ is random, Lemma C.3 holds with the same proof as long as $h$ is independent of the Brownian motions $\{B_{n_A}\}$ defining the multiple stochastic integrals.
Proof of Lemma C.3. We follow the proof of Proposition 4.14 in [18] and use a higher order version of Bourgain's $TT^*$-argument [9]. Let $T : \ell^2_c \to \ell^2_b$ be the linear operator whose kernel is $H_{bc}$. Namely, $T$ is defined by

For $j \in \mathbb{N}$, we define the operator $T^j$ by $T^j = (TT^*)^m$ if $j = 2m$, and $T^j = (TT^*)^m T$ if $j = 2m + 1$. We claim that $T^j$ has a kernel which is given by a linear combination of terms $\mathcal{T}_j$ of the form
$$\mathcal{T}_j = \begin{cases} I_\ell\big[y_{bb'}(z_D)\big], & \text{when } j \text{ is even}, \\ I_\ell\big[y_{bc}(z_D)\big], & \text{when } j \text{ is odd}, \end{cases} \tag{C.8}$$
for some finite index set $D$ and $\ell = |D| \le kj$, where $y_{bb'}(z_D)$ (or $y_{bc}(z_D)$) satisfies the following bound:

where the maximum is taken over all partitions $(B, C)$ of $A$. Here, the implicit constant depends on $k$, $\ell$, and $j$. While it grows with $j$ (and $\ell$), this does not cause an issue since, for a given small $\theta > 0$ in (C.6), we fix $j = j(\theta) \gg 1$.
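The reason for passing to the alternating products $(TT^*)^m$ and $(TT^*)^m T$ can be seen in a purely deterministic toy computation: the operator norm of the $j$-fold product equals $\|T\|^j$, so the $j$-th root of its Hilbert-Schmidt norm bounds $\|T\|$ from above, and this bound improves as $j$ grows. The sketch below assumes a real $2 \times 2$ matrix (so that $T^*$ is the transpose) chosen so that $TT^*$ has eigenvalues 45 and 5; it only illustrates this mechanism and is not the random tensor estimate of Lemma C.3. All function names are illustrative.

```python
import math


def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(row) for row in zip(*A)]


def hs_norm(A):
    """Hilbert-Schmidt (Frobenius) norm."""
    return math.sqrt(sum(x * x for row in A for x in row))


def alternating_product(T, j):
    """(T T^*)^m for j = 2m and (T T^*)^m T for j = 2m + 1, with T real so that T^* is the transpose."""
    Tt = transpose(T)
    out = T
    for i in range(2, j + 1):
        out = matmul(out, Tt if i % 2 == 0 else T)
    return out


# T T^* = [[9, 12], [12, 41]] has eigenvalues 45 and 5, so the operator norm of T is sqrt(45).
T = [[3, 0], [4, 5]]
op_norm = math.sqrt(45)

# j-th root of the Hilbert-Schmidt norm of the j-fold alternating product, j = 1, ..., 8.
bounds = [hs_norm(alternating_product(T, j)) ** (1.0 / j) for j in range(1, 9)]
```

For `j = 1` the bound is the plain Hilbert-Schmidt norm $\sqrt{50}$, and the subsequent values decrease towards $\sqrt{45}$. In the proof, the analogous gain is what allows the loss to be reduced to an arbitrarily small power of $N$ by fixing $j$ large.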
Let j = 1. In this case, comparing (C.8) with (C.7) and (C.5) and using Lemma B.1 (ii), we have y bc (z D ) = Sym(h bcn A f (z A )) with D = A and thus the bound (C.9) follows from Hölder's inequality. Note that, in this case, it follows from Lemma B.1 (iii) that where the right-hand side is the second moment of the Hilbert-Schmidt norm of the operator T . By taking higher powers T j , we control the operator norm of T . Now, assume that the claim with (C.8) and (C.9) hold true for j − 1. We assume that j is odd. The proof for even j is analogous. Noting that T j = T j−1 T , it follows from the inductive hypothesis (C.8) with (C.5) and Lemma B.1 (ii) that the kernel for T j is given by a linear combination of terms T j of the form Hence, it suffices to show that b (Sym(y bb ) ⊗ r Sym(h b c f )) satisfies (C.9) for each 0 ≤ r ≤ min(k, ). For notational simplicity, we drop Sym in Sym(y bb ) and Sym(h b c f ) in the following. Note that this does not cause any issue since, in taking the L 2 (Ω)-norm, we can remove Sym by Jensen's inequality (B.2) as in Section 4. Fix 0 ≤ r ≤ min(k, ). From Definition B.3 on the contraction, we have where z C = (−n C , t C ) for given z C = (n C , t C ). Here, B 1 , B 2 , and C are pairwise disjoint sets such that |B 1 | = − r, |B 2 | = k − r, |C| = r, B = B 1 ∪ B 2 , and (by suitable relabeling of indices) Then, from (C.10), Cauchy-Schwarz's inequality (in t C ), Minkowski's integral inequality (with , (C.1), and the identification in (C.11), we have (C.12) Moreover, from (C.1), we have where the maximum is taken over all partitions (A 1 , A 2 ) of A. Hence, from (C.12), (C.13), and the inductive hypothesis (C.9) (with j − 1 in place of j), we obtain (C.9) for j. Therefore, by induction, the claim holds for any j ∈ N.