1 Introduction

1.1 Motivation from Weak Universalities

The study of singular stochastic PDEs has received much attention recently, and powerful theories are being developed to enhance the general understanding of this area. We refer to the excellent surveys [8, 11] and references therein for recent breakthroughs in the field.

One of the motivations to study singular SPDEs is that many of them are expected to be universal objects in crossover regimes of their respective universality classes, a phenomenon known as weak universality. One well-known example is the KPZ equation [14], formally given by

$$\begin{aligned} \partial _t h = \partial _x^2 h + \lambda (\partial _x h)^2 + \xi , \end{aligned}$$

where \(\xi \) is the one-dimensional space-time white noise. The equation is only formal since it involves the square of a distribution. Nevertheless, the solution can be rigorously constructed in a few different ways, including the Cole–Hopf transform [1], pathwise solutions via rough paths/regularity structures [3, 9, 10] or paracontrolled distributions [6], or the notion of energy solution through a martingale problem [4, 7].

The KPZ equation is expected to be the universal model for weakly asymmetric interface growth at large scales. In [12], the authors considered continuous microscopic models of the type

$$\begin{aligned} \partial _t \tilde{h} = \partial _x^2 \tilde{h} + \sqrt{\varepsilon } F(\partial _x \tilde{h}) + \tilde{\xi } \end{aligned}$$
(1.1)

for any even polynomial F and smooth stationary Gaussian random field \(\tilde{\xi }\). The main result in [12] is that there exists \(C_{\varepsilon } \rightarrow +\infty \) such that the rescaled and re-centered height function

$$\begin{aligned} h_{\varepsilon }(t,x) := \varepsilon ^{\frac{1}{2}} \tilde{h}(t/\varepsilon ^2, x/\varepsilon ) - C_{\varepsilon }t \end{aligned}$$

converges to the solution of the KPZ equation with

$$\begin{aligned} \lambda = \frac{1}{2} \mathbf {E}F''(\tilde{\varPsi }), \end{aligned}$$
(1.2)

where \(\tilde{\varPsi } = \partial _x P * \tilde{\xi }\) and P is the heat kernel. Hairer and Xu [13] extended the result to arbitrary even functions F with sufficient regularity and polynomial growth. Similar results have also been obtained in [5] for models at stationarity.

To see why the convergence holds with \(\lambda \) given by (1.2), we write down the equation for \(h_{\varepsilon }\):

$$\begin{aligned} \partial _t h_{\varepsilon } = \partial _x^2 h_{\varepsilon } + \varepsilon ^{-1} F\left( \sqrt{\varepsilon } \partial _x h_\varepsilon \right) + \xi _{\varepsilon } - C_{\varepsilon }, \end{aligned}$$

where \(\xi _{\varepsilon }(t,x) = \varepsilon ^{-\frac{3}{2}} \tilde{\xi }(t/\varepsilon ^2, x/\varepsilon )\) approximates the space-time white noise \(\xi \) at scale \(\varepsilon \). Let \(\varPsi _\varepsilon = \partial _x P * \xi _\varepsilon \); then \(\sqrt{\varepsilon } \varPsi _\varepsilon \) is a stationary Gaussian field with finite variance. In addition, by analogy with the standard KPZ equation, it is reasonable to expect that the remainder \(\partial _x u_\varepsilon =\partial _x h_\varepsilon - \varPsi _\varepsilon \) is almost bounded. Hence, one can Taylor expand the nonlinearity \(\varepsilon ^{-1} F(\sqrt{\varepsilon } \partial _x h_\varepsilon )\) around \(\sqrt{\varepsilon } \varPsi _\varepsilon \) and formally get

$$\begin{aligned} \begin{aligned} \varepsilon ^{-1}F(\sqrt{\varepsilon } \partial _x h_\varepsilon ) - C_\varepsilon&=\left( \varepsilon ^{-1}F(\sqrt{\varepsilon } \varPsi _\varepsilon ) - C_{\varepsilon } \right) + \varepsilon ^{-\frac{1}{2}} F'(\sqrt{\varepsilon } \varPsi _\varepsilon ) \cdot (\partial _x u_\varepsilon )\\&\quad + \frac{1}{2} F''(\sqrt{\varepsilon } \varPsi _\varepsilon ) \cdot (\partial _x u_\varepsilon )^{2} + \mathcal {O}(\varepsilon ^{\frac{1}{2}-}). \end{aligned} \end{aligned}$$

One then needs to show the convergence of the objects \(\varepsilon ^{-1} F(\sqrt{\varepsilon } \varPsi _\varepsilon ) - C_\varepsilon \), \(\varepsilon ^{-\frac{1}{2}} F'(\sqrt{\varepsilon } \varPsi _\varepsilon )\) and \(F''(\sqrt{\varepsilon } \varPsi _\varepsilon )\), as well as their products, which arise from the local expansion of \(\partial _x u_\varepsilon \). Furthermore, such convergences need to be established in the optimal regularity spaces, which requires one to obtain pth moment bounds for these stochastic objects for arbitrarily large p.

At least formally, by chaos expanding \(\varepsilon ^{-1}F(\sqrt{\varepsilon } \varPsi _\varepsilon )\) and taking \(C_\varepsilon = \varepsilon ^{-1} \mathbf {E}F(\sqrt{\varepsilon } \varPsi _\varepsilon )\), one can see that

$$\begin{aligned} \varepsilon ^{-1} F(\sqrt{\varepsilon } \varPsi _\varepsilon ) - C_{\varepsilon } \rightarrow \lambda \varPsi ^{\diamond 2}, \end{aligned}$$
(1.3)

where \(\lambda \) is given in (1.2) and \(\varPsi = \partial _x P * \xi \) is the limit of \(\varPsi _\varepsilon \). This is because the odd chaos components vanish (F is even), all terms from the fourth chaos onward vanish termwise as \(\varepsilon \rightarrow 0\), and only the second chaos component survives in the limit.

When F is an even polynomial, this heuristic indeed gives a direct proof of the convergence of the term in (1.3). However, when F is not polynomial, the actual proof of the convergence becomes much subtler. The main obstacle is that \(F(\sqrt{\varepsilon } \varPsi _\varepsilon )\) expands into an infinite chaos series. If we crudely control the high moments of each term in the series as in the polynomial case, then in order for these termwise moment bounds to be summable, we need to impose very strong conditions on F (namely, that its Fourier transform be compactly supported), which is clearly too restrictive.

Instead, in [13], the authors expanded \(F(\sqrt{\varepsilon } \varPsi _\varepsilon )\) in terms of its Fourier transform, developed a procedure for obtaining pointwise correlation bounds on trigonometric functions of Gaussians, and deduced the desired convergence from those bounds.

Similar universality results are also present in the dynamical \(\varPhi ^4_3\) model. The weak universality of the \(\varPhi ^4_3\) equation for a large class of symmetric phase coexistence models with polynomial potential was established in [13] for Gaussian noise and then extended in [17] to non-Gaussian noise. The extension beyond polynomial potential (even with Gaussian noise) faces the same difficulties as in the KPZ case discussed above. In the recent work [2], the authors developed different methods based on Malliavin calculus to control similar objects. The methods developed in [2, 13] to treat general nonlinearities are robust enough to cover both the KPZ and \(\varPhi ^4_3\) equations as well as other similar situations.

In this article, we follow the ideas developed in [13] and prove a uniform bound in a special case considered there. This special case is technically simpler to explain, but is also illustrative enough to reveal the main idea of the proof in the more general case. Furthermore, we obtain a better bound in this special case, thus yielding convergence results for functions F with lower regularity.

1.2 Main Statements

Fix a scaling \(\mathfrak {s}= (s_1, \dots , s_d)\) on \(\mathbf {R}^d\), and let \(|\mathfrak {s}| = \sum _{j} s_j\). The metric induced by \(\mathfrak {s}\) is

$$\begin{aligned} |x|_{\mathfrak {s}} := |x_1|^{\frac{1}{s_1}} + \dots + |x_d|^{\frac{1}{s_d}}. \end{aligned}$$

Since the scaling is fixed throughout the article, we simply write |x| instead of \(|x|_{\mathfrak {s}}\). For any Gaussian random field X, any function \(F: \mathbf {R}\rightarrow \mathbf {R}\) with at most exponential growth, and any integer \(m \ge 0\), we write

$$\begin{aligned} \mathcal {H}_{m}\left( F(X)\right) = \sum _{n \ge m} C_{n} X^{\diamond n}, \end{aligned}$$

where \(X^{\diamond n}\) denotes the n-th Wick power of X, and \(C_{n} = \frac{1}{n!} \mathbf {E}F^{(n)}(X)\) is the coefficient of \(X^{\diamond n}\) in the chaos expansion of F(X). In other words, \(\mathcal {H}_{m}(F(X))\) is F(X) with all chaos components of order below m removed. We refer to [16, Chapter 1] for more details on the chaos expansion of random variables.
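As a concrete sanity check of these conventions (ours, not part of the argument), one can verify numerically that \(C_{n} = \frac{1}{n!} \mathbf {E}F^{(n)}(X)\) coincides with the projection \(\mathbf {E}\left( F(X) X^{\diamond n}\right) /(n! \, (\mathbf {E}X^2)^{n})\), using that \(X^{\diamond n} = \sigma ^n \mathrm{He}_n(X/\sigma )\) with \(\sigma ^2 = \mathbf {E}X^2\), where \(\mathrm{He}_n\) are the probabilists' Hermite polynomials. The test function \(F = \exp \) (for which \(F^{(n)} = F\)) and the value \(\sigma = 0.7\) below are illustrative choices of ours.

```python
# Minimal numerical sketch (ours) of the chaos expansion conventions:
# X^{<>n} = sigma^n He_n(X/sigma), and C_n = E F^{(n)}(X)/n! equals the
# projection E[F(X) X^{<>n}] / (n! sigma^{2n}).
import numpy as np
from numpy.polynomial import hermite_e as He
from math import factorial, sqrt, pi

sigma = 0.7
z, w = He.hermegauss(100)        # quadrature for the weight exp(-z^2/2)
x = sigma * z                    # samples of X ~ N(0, sigma^2)
Fx = np.exp(x)                   # F = exp, so F^{(n)} = exp for every n

for n in range(5):
    wick = sigma**n * He.hermeval(z, [0] * n + [1])        # X^{<>n} at the nodes
    proj = (w @ (Fx * wick)) / sqrt(2 * pi) / (factorial(n) * sigma**(2 * n))
    direct = (w @ Fx) / sqrt(2 * pi) / factorial(n)        # E F^{(n)}(X) / n!
    print(n, proj, direct)       # the two columns agree up to quadrature error
```

We have the following bound.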

Theorem 1.1

Let \(\alpha \in (0,|\mathfrak {s}|)\), and \(\{\varPhi _\varepsilon \}_{\varepsilon \in (0,1)}\) be a class of centered Gaussian random fields satisfying

$$\begin{aligned} \frac{\varepsilon ^{\alpha }}{\varLambda \left( |x-y|+\varepsilon \right) ^{\alpha }} \le \mathbf {E}\left( \varPhi _\varepsilon (x) \varPhi _\varepsilon (y) \right) \le \frac{\varLambda \varepsilon ^{\alpha }}{\left( |x-y|+\varepsilon \right) ^{\alpha }} \end{aligned}$$
(1.4)

for some \(\varLambda > 1\) and for all \(x,y \in \mathbf {R}^d\) and all \(\varepsilon \in (0,1)\). Then, for every \(K \ge 1\) and \(m, r \in \mathbf {N}\), there exists \(C>0\) depending on these parameters and \(\varLambda \) only such that

$$\begin{aligned} \left| \mathbf {E}\prod _{j=1}^{K} \partial _{\theta }^{r} \mathcal {H}_{m}\left( \mathrm{e}^{i \theta \varPhi _\varepsilon (x_j)} \right) \right| \le C \mathbf {E}\prod _{j=1}^{K} \left( \varPhi _{\varepsilon }^{\diamond m}(x_j) + \varPhi _{\varepsilon }^{\diamond (m+1)}(x_j) \right) \end{aligned}$$
(1.5)

for all \(\varepsilon \in (0,1)\), \(\theta \in \mathbf {R}\) and \(\mathbf {x}= (x_k)_{k=1}^{K}\).

Theorem 1.1 is the main technical ingredient to establish that if \(\{\varPsi _\varepsilon \}\) approximates a certain Gaussian random field \(\varPsi \), then a large class of nonlinear functions of \(\varPsi _\varepsilon \), after proper rescaling and re-centering, converges to certain Wick powers of \(\varPsi \). We first give the assumption on the random field \(\varPsi \).

Assumption 1.2

\(\varPsi \) is a stationary Gaussian random field with correlation

$$\begin{aligned} \mathbf {E}\left( \varPsi (x) \varPsi (y) \right) = G(x-y), \end{aligned}$$

where G satisfies the bounds

$$\begin{aligned} \frac{c}{|x|^\alpha } \le G(x) \le \frac{C}{|x|^\alpha }\ \quad \text {and} \quad |(\partial _j G)(x)| \le C |x|^{-\alpha -s_j} \end{aligned}$$

for some \(\alpha \in (0,|\mathfrak {s}|)\) and all \(x \in \mathbf {R}^d\). In addition, there exists a locally integrable function g such that

$$\begin{aligned} \varepsilon ^{\alpha } G\left( \varepsilon ^{s_1} x_1, \dots , \varepsilon ^{s_d}x_d\right) \rightarrow g(x) \end{aligned}$$
(1.6)

in \(L^{1}(\varOmega )\) for every bounded subset \(\varOmega \) of \(\mathbf {R}^d\).

For \(M \in \mathbf {N}\) and open subset \(\mathcal {I}\subset \mathbf {R}\), we define the norm \(\Vert \cdot \Vert _{M,\mathcal {I}}\) on distributions on \(\mathbf {R}\) by

$$\begin{aligned} \Vert \Upsilon \Vert _{M,\mathcal {I}} := \sup _{0 \le r \le M} \sup _{{\mathop {\Vert \varphi \Vert _{\mathcal {C}^{M}(\mathcal {I})} \le 1}\limits ^{\varphi \in \mathcal {C}_{c}^{M}(\mathcal {I}):}}} |\langle \Upsilon , \varphi ^{(r)}\rangle |. \end{aligned}$$

Our assumption on the function \(F: \mathbf {R}\rightarrow \mathbf {R}\) is the following.

Assumption 1.3

There exists \(M \in \mathbf {N}\) such that the Fourier transform of F satisfies

$$\begin{aligned} \sum _{k \in \mathbf {Z}} \Vert \widehat{F}\Vert _{M,\mathcal {I}_k} < +\infty , \end{aligned}$$

where \(\mathcal {I}_k = (k-1,k+1)\).

For every \(\rho \in \mathcal {C}_{c}^{\infty }(\mathbf {R}^d)\) and \(\varepsilon >0\), let

$$\begin{aligned} \rho _{\varepsilon }(x) = \varepsilon ^{-|\mathfrak {s}|} \rho \left( x_{1}/\varepsilon ^{s_1}, \dots , x_{d}/\varepsilon ^{s_d}\right) . \end{aligned}$$

The main convergence theorem is the following.

Theorem 1.4

Let \(\varPsi \) and F satisfy the above assumptions, and g be the limiting \(L^1\) function of \(\varepsilon ^\alpha G(\varepsilon \cdot )\) as in Assumption 1.2. Let \(\rho \) be a mollifier on \(\mathbf {R}^d\) and \(\varPsi _{\varepsilon } = \varPsi * \rho _{\varepsilon }\). For every integer m, define

$$\begin{aligned} a_{m} := \frac{1}{m!} \left( F^{(m)}*\mu \right) (0), \end{aligned}$$
(1.7)

where \(\mu \sim \mathcal {N}(0,\sigma ^2)\) is a Gaussian measure on \(\mathbf {R}\) with variance

$$\begin{aligned} \sigma ^2 = \int g(x-y) \rho (x) \rho (y) \mathrm{d}x \mathrm{d}y. \end{aligned}$$
(1.8)

Then, for every \(m < \frac{|\mathfrak {s}|}{\alpha }\) and every sufficiently small \(\kappa \), we have

$$\begin{aligned} \varepsilon ^{-\frac{m \alpha }{2}} \mathcal {H}_{m}\left( F(\varepsilon ^{\frac{\alpha }{2}} \varPsi _{\varepsilon })\right) \rightarrow a_{m} \varPsi ^{\diamond m} \end{aligned}$$

as \(\varepsilon \rightarrow 0\) almost surely in \(\mathcal {C}^{-\frac{m\alpha }{2}-\kappa }\). Here, \(\varPsi ^{\diamond m}\) is the m-th Wick power of \(\varPsi \).

We will first prove the main bound (1.5) in Theorem 1.1 and then establish the convergence in Theorem 1.4 by Fourier expanding F and applying (1.5) to \(\varPhi _\varepsilon = \varepsilon ^{\frac{\alpha }{2}} \varPsi _\varepsilon \). Note that although the bound in Theorem 1.1 holds for every integer m, the convergence in Theorem 1.4 requires \(m<\frac{|\mathfrak {s}|}{\alpha }\). This can be easily seen from the fact that if \(m \ge \frac{|\mathfrak {s}|}{\alpha }\), then \(\varPsi ^{\diamond m}\) would have divergent covariance and hence would not be well defined.

Example 1.5

One typical example of the random field \(\varPsi \) satisfying Assumption 1.2 is \(\varPsi = \mathcal {L}^{-\frac{\beta }{2}} \xi \), where \(\xi \) is the white noise on \(\mathbf {R}^d\), \(\beta = \frac{1}{2}(|\mathfrak {s}|-\alpha )\), and \(\mathcal {L}\) is the differential operator given by

$$\begin{aligned} \mathcal {L}= \sum _{j=1}^{d} \left( -\partial _j^2\right) ^{\frac{1}{s_j}}. \end{aligned}$$

This is the fractional Gaussian field. In this case, the convolution kernel K of \(\mathcal {L}^{-\frac{\beta }{2}}\) is homogeneous (in the scaling \(\mathfrak {s}\)) of order \(-|\mathfrak {s}|+\beta \) in the sense that

$$\begin{aligned} K(\lambda ^{s_1}x_1, \dots , \lambda ^{s_d}x_d) = \lambda ^{-|\mathfrak {s}|+\beta } K(x_1, \dots , x_d) \end{aligned}$$

for all \(\lambda > 0\). Hence, for g and G as in Assumption 1.2, we have the expression

$$\begin{aligned} g(x) = G(x) = \int _{\mathbf {R}^d} K(x+y) K(y) \mathrm{d}y. \end{aligned}$$

The same is true when \(\mathbf {R}^d\) is given the parabolic scaling \((2, 1, \dots , 1)\) and \(\mathcal {L}= \partial _t - \varDelta \) is the heat operator.

In the case of standard Euclidean scaling \(\mathfrak {s}= (1, \dots , 1)\), \(\varPsi \) is simply the standard fractional Gaussian field \((-\varDelta )^{-\frac{\beta }{2}} \xi \). We refer to the survey [15] for more details on fractional Gaussian fields.
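For readers who wish to simulate such fields, the following sketch (ours, not from the paper) samples a periodic approximation of \((-\varDelta )^{-\frac{\beta }{2}}\xi \) in \(d=1\) by applying the Fourier multiplier \(|2\pi k|^{-\beta }\) to discrete white noise. The grid size, the seed, the zeroing of the \(k=0\) mode, and the overall normalization are ad hoc choices made here.

```python
# A rough illustration (ours) of Example 1.5 in d = 1 with Euclidean scaling:
# a periodic approximation of the fractional Gaussian field (-Delta)^{-beta/2} xi.
import numpy as np

rng = np.random.default_rng(0)
N, beta = 4096, 0.25                     # beta = (|s| - alpha)/2, so alpha = 0.5 here

noise_hat = np.fft.fft(rng.standard_normal(N))
k = np.fft.fftfreq(N, d=1.0 / N)         # integer frequencies on the unit torus
mult = np.zeros(N)
mult[1:] = np.abs(2 * np.pi * k[1:]) ** (-beta)   # drop k = 0: zero-mean field
psi = np.fft.ifft(mult * noise_hat).real          # one approximate sample of Psi

# Empirically, its correlations decay roughly like |x - y|^{-alpha} at
# intermediate scales, consistent with Assumption 1.2.
```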

Example 1.6

As for the function F, all \(\mathcal {C}^{1+}\) functions with polynomial growth satisfy Assumption 1.3. More precisely, if \(F \in \mathcal {C}^{1,\beta }(\mathbf {R})\) for some \(\beta >0\), and there exist \(C, M>0\) such that

$$\begin{aligned} |F(x)| + |F'(x)| + \sup _{|h|<1} \frac{|F'(x+h)-F'(x)|}{|h|^{\beta }} \le C (1+|x|)^{M} \end{aligned}$$

for all \(x \in \mathbf {R}\), then F satisfies Assumption 1.3.

Example 1.7

One very interesting example of the microscopic model (1.1) is \(F(x)=|x|\). It is almost linear, but one still expects its large-scale behavior to be nonlinear, as described by the KPZ equation. This function F is not even \(\mathcal {C}^1\), but we still have

$$\begin{aligned} \Vert \widehat{F}\Vert _{1,\mathcal {I}_k} \le C (1+|k|)^{-2}, \end{aligned}$$

which clearly satisfies Assumption 1.3. Hence, as a consequence of Theorem 1.4, if \(\varPsi = \partial _x P * \xi \), where \(\xi \) is the space-time white noise in one space dimension and P is the heat kernel, then we have

$$\begin{aligned} \varepsilon ^{-\frac{1}{2}} \left( |\varPsi _\varepsilon | - \mathbf {E}|\varPsi _\varepsilon | \right) \rightarrow a \varPsi ^{\diamond 2} \end{aligned}$$

for some \(a>0\) depending on the mollifier. This is the first step toward establishing convergence to the KPZ equation for the microscopic model of the form (1.1) with \(F(x)=|x|\).
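For this example, the constant a can be computed explicitly from (1.7); this is a short computation we include for concreteness. Since \(F''=2\delta _0\) in the distributional sense, identifying \(a = a_2\) from Theorem 1.4 gives

$$\begin{aligned} a = a_{2} = \frac{1}{2!}\left( F''*\mu \right) (0) = \mu (0) = \frac{1}{\sqrt{2\pi \sigma ^2}} > 0, \end{aligned}$$

where \(\mu (0)\) denotes the density of \(\mathcal {N}(0,\sigma ^2)\) at the origin, and \(\sigma ^2\) is given by (1.8); the dependence on the mollifier enters precisely through \(\sigma ^2\).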

1.3 Remarks and Possible Generalizations

Theorem 1.1 is a special case of [13, Theorem 6.4] in that it allows only one frequency variable \(\theta \) rather than multiple ones. On the other hand, it is also more general since it allows subtraction of Wiener chaos up to any order. Furthermore, the bound (1.5) is completely independent of \(\theta \), while the corresponding one in [13] is polynomial in \(\theta \). As a consequence of this improvement, the condition on F for the convergence in Theorem 1.4 to hold is weaker.

The main technical difference that results in this improvement, as we shall see later in Sect. 2, is that in the clustering procedure, we are able to take the clustering distance L to be independent of \(\theta \) rather than quadratic in \(\theta \) as in [13].

We note that the convergence results in this article are not sufficient to establish weak universality in general situations. That would require convergence of products of the objects considered in Theorem 1.4, with possible subtraction of extra chaos components after taking the product. The convergence of these products requires a more general bound than Theorem 1.1 and [13, Theorem 6.2]. We leave this to future work.

2 Proof of Theorem 1.1

This section is devoted to the proof of Theorem 1.1. Assumptions 1.2 and 1.3 on \(\varPsi \) and F are irrelevant here. We fix \(\alpha \in (0,|\mathfrak {s}|)\), and let \(\{\varPhi _{\varepsilon }\}_{\varepsilon \in (0,1)}\) be a family of Gaussian random fields with correlation functions satisfying (1.4). The following preliminary bounds on the correlation function will be used throughout the section.

Proposition 2.1

Let \(\gamma \ge 1\). If \(|x'-y'| \le \gamma |x-y|\), then

$$\begin{aligned} \mathbf {E}\left( \varPhi _{\varepsilon }(x) \varPhi _{\varepsilon }(y) \right) \le \gamma ^{\alpha } \varLambda ^2 \mathbf {E}\left( \varPhi _{\varepsilon }(x') \varPhi _{\varepsilon }(y') \right) . \end{aligned}$$
(2.1)

The bound is uniform over all \(\varepsilon \in (0,1)\) and all pairs of points \((x,y), (x',y') \in (\mathbf {R}^d)^{2}\) satisfying the above constraint. As a consequence, we have

$$\begin{aligned} \mathbf {E}\left( \varPhi _{\varepsilon }(x)\varPhi _{\varepsilon }(y)\right) \; \mathbf {E}\left( \varPhi _{\varepsilon }(x)\varPhi _{\varepsilon }(z)\right) \le \frac{2^\alpha \varLambda ^3 \varepsilon ^{\alpha } }{\left( (|x-y| \wedge |x-z|)+\varepsilon \right) ^{\alpha }} \cdot \mathbf {E}\left( \varPhi _{\varepsilon }(y)\varPhi _{\varepsilon }(z)\right) \end{aligned}$$
(2.2)

for all \(\varepsilon \in (0,1)\) and all \(x,y,z \in \mathbf {R}^d\).

Proof

The first bound follows from

$$\begin{aligned} \mathbf {E}\left( \varPhi _{\varepsilon }(x') \varPhi _{\varepsilon }(y') \right) \ge \frac{\varepsilon ^\alpha }{\varLambda \left( |x'-y'|+\varepsilon \right) ^{\alpha }} \ge \frac{1}{\varLambda ^2 \gamma ^\alpha } \cdot \frac{\varLambda \varepsilon ^\alpha }{\left( |x-y|+\varepsilon \right) ^{\alpha }} \ge \frac{\mathbf {E}\left( \varPhi _{\varepsilon }(x)\varPhi _{\varepsilon }(y)\right) }{\varLambda ^2 \gamma ^\alpha }, \end{aligned}$$

where we have used the assumption \(\gamma \ge 1\) in the second inequality. As for the second one, it suffices to notice

$$\begin{aligned} |y-z| \le 2 \max \{|x-y|, |x-z|\} \end{aligned}$$

and then apply (2.1). \(\square \)

In what follows, we keep our notation the same as in [13, Section 6]. For every finite set \(\mathcal {A}\), let \(\mathbf {N}^{\mathcal {A}}\) be the set of multi-indices on \(\mathcal {A}\). For an \(\mathcal {A}\)-tuple of Gaussian random variables \(\mathbf {X}= (X_{a})_{a \in \mathcal {A}}\) and \(\mathbf {n}\in \mathbf {N}^{\mathcal {A}}\), we write \(\mathbf {X}^{\diamond \mathbf {n}}\) for the Wick product \({\diamond }_{a \in \mathcal {A}}\, X_{a}^{\diamond n_{a}}\). Similarly, we write \(\mathbf {n}! = \prod _{a \in \mathcal {A}} n_{a}!\) and \(|\mathbf {n}| = \sum _{a \in \mathcal {A}} n_a\). In general, we use standard letters for scalars and boldface ones to denote tuples.

Fix \(K \ge 1\) and \(m, r \in \mathbf {N}\). Let \([K] = \{1, \dots , K\}\). We also fix \(\theta \in \mathbf {R}\) and \(\mathbf {x}= (x_k)_{k \in [K]} \in (\mathbf {R}^d)^{K}\) arbitrary. Write \(\langle \theta \rangle = 1 + |\theta |\). All the constants C below depend on \(\varLambda \), K, m, and r only unless otherwise mentioned. We seek bounds that are uniform in \(\varepsilon \), \(\theta \), and \(\mathbf {x}\). We also write \(X_j = \varPhi _{\varepsilon }(x_j)\) for simplicity since the bounds will be independent of \(\varepsilon \).

2.1 Clustering and the First Bound

Let \(L>0\) be a fixed large constant whose value, depending on \(\varLambda \), K, m, and r only, will be specified later. Let \(\sim \) be an equivalence relation on [K] such that \(j \sim j'\) if there exists \(k \in \mathbf {N}\) and \(j_0, \dots , j_k \in [K]\) with \(j_0 = j\) and \(j_k = j'\) such that

$$\begin{aligned} |x_{j_{\ell +1}} - x_{j_\ell }| \le L \varepsilon \end{aligned}$$

for all \(\ell = 0, \dots , k-1\). We let \(\mathscr {C}\) denote the partition of [K] into clusters obtained in this way. In other words, j and \(j'\) belong to the same cluster if and only if starting from \(x_{j}\), one can reach \(x_{j'}\) by performing jumps with sizes at most \(L \varepsilon \) onto connecting points in \(\mathbf {x}\).
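The clustering can be computed with a simple union–find pass over all pairs. The following sketch (ours, purely illustrative; the helper names and example points are not from the paper) implements the equivalence relation above for the scaled metric of Sect. 1.2.

```python
# A small sketch (ours) of the clustering step: points are linked when their
# scaled distance is at most L*eps; clusters are the connected components.
import numpy as np

def scaled_dist(x, y, s):
    """|x - y|_s = sum_j |x_j - y_j|^{1/s_j}, the metric of Section 1.2."""
    return float(np.sum(np.abs(np.asarray(x) - np.asarray(y)) ** (1.0 / np.asarray(s))))

def clusters(points, s, L, eps):
    """Partition indices 0..K-1 via the relation |x_i - x_j|_s <= L*eps."""
    K = len(points)
    parent = list(range(K))
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]   # path halving
            a = parent[a]
        return a
    for i in range(K):
        for j in range(i + 1, K):
            if scaled_dist(points[i], points[j], s) <= L * eps:
                parent[find(i)] = find(j)   # merge the two components
    out = {}
    for i in range(K):
        out.setdefault(find(i), []).append(i)
    return list(out.values())

# Example: on the line (s = (1,)), with L = 2 and eps = 0.1, the points
# 0.0, 0.15, 0.3 chain into one cluster and 5.0 remains a singleton.
print(clusters([[0.0], [0.15], [0.3], [5.0]], s=[1.0], L=2.0, eps=0.1))
```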

We distinguish two cases depending on whether \(\mathscr {C}\) contains singletons or not. We first prove (1.5) when \(\mathscr {C}\) has no singleton; that is, every cluster in \(\mathscr {C}\) has at least two elements. In this case, we write down the explicit expression

$$\begin{aligned} \partial _\theta ^r \mathcal {H}_{m} \left( \mathrm{e}^{i \theta X_j} \right) = (i X_j)^{r} \mathrm{e}^{i \theta X_j} - \sum _{n \le m-1} \frac{i^n}{n!} \; \partial _{\theta }^{r} \left( \mathrm{e}^{-\frac{\theta ^2 \mathbf {E}X_j^2}{2}} \theta ^n \right) \; X_{j}^{\diamond n}. \end{aligned}$$
(2.3)
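The expression (2.3) rests on the chaos expansion \(\mathrm{e}^{i\theta X} = \sum _{n \ge 0} \frac{(i\theta )^n}{n!}\, \mathrm{e}^{-\frac{\theta ^2 \mathbf {E}X^2}{2}}\, X^{\diamond n}\). The following numerical check (ours; the values of \(\sigma \) and \(\theta \) are illustrative choices) recovers these coefficients by projecting \(\mathrm{e}^{i\theta X}\) onto Wick powers with Gauss–Hermite quadrature.

```python
# Numerical check (ours) of the chaos coefficients behind (2.3): for
# X ~ N(0, sigma^2), the n-th chaos coefficient of exp(i*theta*X) is
# (i*theta)^n / n! * exp(-theta^2 * E X^2 / 2).
import numpy as np
from numpy.polynomial import hermite_e as He
from math import factorial, sqrt, pi

sigma, theta = 0.9, 1.3
z, w = He.hermegauss(100)
x = sigma * z

for n in range(5):
    wick = sigma**n * He.hermeval(z, [0] * n + [1])        # X^{<>n} at the nodes
    coeff = (w @ (np.exp(1j * theta * x) * wick)) / sqrt(2 * pi) \
            / (factorial(n) * sigma**(2 * n))
    predicted = (1j * theta) ** n / factorial(n) * np.exp(-theta**2 * sigma**2 / 2)
    print(n, coeff, predicted)                             # agree up to quadrature error
```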

Since \(\mathbf {E}X_j^2 \in [\varLambda ^{-1}, \varLambda ]\), the coefficients \(\partial _\theta ^r (\mathrm{e}^{-\frac{\theta ^2 \mathbf {E}X_j^2}{2}}\theta ^n)\) are uniformly bounded in \(\theta \). We then plug the expansion (2.3) into the left-hand side of (1.5). The Gaussianity of the \(X_j\)’s and boundedness of the coefficients of the removed chaos imply

$$\begin{aligned} \left| \mathbf {E}\prod _{j=1}^{K} \partial _{\theta }^{r} \mathcal {H}_{m} \left( \mathrm{e}^{i \theta X_j} \right) \right| \le C \end{aligned}$$
(2.4)

for some C independent of \(\varepsilon \), \(\theta \), and \(\mathbf {x}\). It then remains to show that the right-hand side of (1.5) is bounded by a constant from below. For this, we use the assumption that \(\mathscr {C}\) contains no singletons.

Let \({\mathfrak {u}}\in \mathscr {C}\) be arbitrary, and label its elements by \({\mathfrak {u}}= \{j_1, \dots , j_{|{\mathfrak {u}}|}\}\). Since \(\mathscr {C}\) has no singleton set, we necessarily have \(|{\mathfrak {u}}| \ge 2\). By clustering, any two points in \({\mathfrak {u}}\) are at most \(K L \varepsilon \) away from each other. Hence, the assumption (1.4) implies (recalling that \(X_j = \varPhi _\varepsilon (x_j)\))

$$\begin{aligned} \mathbf {E}\left( X_j X_{j'} \right) \ge \frac{\varepsilon ^\alpha }{\varLambda \left( |x_{j}-x_{j'}| + \varepsilon \right) ^{\alpha }} \ge \frac{1}{\varLambda (KL+1)^{\alpha }} \end{aligned}$$

for every two points \(j, j' \in {\mathfrak {u}}\), which implies

$$\begin{aligned} \left( \varLambda (KL+1)^{\alpha } \right) ^{-\left\lfloor \frac{m+1}{2} \right\rfloor |{\mathfrak {u}}|} \le \left( \prod _{\ell =1}^{|{\mathfrak {u}}|} \mathbf {E}\left( X_{j_\ell } X_{j_{\ell +1}} \right) \right) ^{\left\lfloor \frac{m+1}{2} \right\rfloor }, \end{aligned}$$

where we identified \(j_{|{\mathfrak {u}}|+1}\) with \(j_1\). Multiplying the above bound over all \({\mathfrak {u}}\in \mathscr {C}\) and using Wick’s formula and positivity of the correlations, we obtain

$$\begin{aligned} \left( \varLambda (KL+1)^{\alpha } \right) ^{-\left\lfloor \frac{m+1}{2} \right\rfloor K} \le \prod _{{\mathfrak {u}}\in \mathscr {C}} \left( \prod _{\ell =1}^{|{\mathfrak {u}}|} \mathbf {E}\left( X_{j_\ell } X_{j_{\ell +1}} \right) \right) ^{\left\lfloor \frac{m+1}{2} \right\rfloor } \le \mathbf {E}\prod _{j=1}^{K} \left( X_{j}^{\diamond m} + X_{j}^{\diamond (m+1)} \right) . \end{aligned}$$
(2.5)

This is the place where we use \(|{\mathfrak {u}}| \ge 2\) for every \({\mathfrak {u}}\in \mathscr {C}\), for otherwise the middle term above would contain the variance of a single random variable and the second inequality in (2.5) would be wrong. Combining (2.4) and (2.5), we obtain

$$\begin{aligned} \left| \mathbf {E}\prod _{j=1}^{K} \partial _{\theta }^{r} \mathcal {H}_{m} \left( \mathrm{e}^{i \theta X_j} \right) \right| \le C \left( \varLambda (KL+1)^{\alpha }\right) ^{K \left\lfloor \frac{m+1}{2} \right\rfloor } \cdot \mathbf {E}\prod _{j=1}^{K} \left( X_{j}^{\diamond m} + X_{j}^{\diamond (m+1)} \right) , \end{aligned}$$
(2.6)

which matches the right-hand side of (1.5). Since L (to be chosen later) is also independent of \(\varepsilon \), \(\theta \), and \(\mathbf {x}\), this concludes the case when \(\mathscr {C}\) contains no singleton. The rest of the section is devoted to establishing (1.5) when at least one cluster in \(\mathscr {C}\) is a singleton.
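Both here and in Sects. 2.4–2.6, expectations of products of Wick powers are evaluated by Wick's formula as sums over pairings of "legs" with no self-contractions at a single vertex; in particular, when all pairwise correlations are nonnegative, every pairing contributes a nonnegative term, which is the positivity used in (2.5). The following small evaluator (ours, for illustration only) implements this sum directly.

```python
def wick_expectation(n, R):
    """E prod_j X_j^{<> n_j}: sum over pairings of legs (n_j legs at vertex j),
    where legs at the same vertex may not pair; each pair contributes R[i][j]."""
    def rec(legs):
        if sum(legs) == 0:
            return 1.0
        legs = list(legs)
        j = next(i for i, c in enumerate(legs) if c > 0)
        legs[j] -= 1                        # pick one leg at vertex j ...
        total = 0.0
        for jp in range(len(legs)):
            if jp != j and legs[jp] > 0:    # ... and pair it with a leg elsewhere
                mult = legs[jp]             # number of available partner legs
                legs[jp] -= 1
                total += mult * R[j][jp] * rec(tuple(legs))
                legs[jp] += 1
        return total
    return rec(tuple(n))

# Sanity check: E[(X^{<>2})(Y^{<>2})] = 2 (E XY)^2.
R = [[1.0, 0.5], [0.5, 1.0]]
print(wick_expectation((2, 2), R))          # prints 0.5
```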

2.2 Expansion

Given the collection of points \(\mathbf {x}\) and the clustering above, let

$$\begin{aligned} \mathcal {S}= \left\{ {\mathfrak {u}}\in \mathscr {C}: |{\mathfrak {u}}|=1 \right\} \end{aligned}$$

be the set of singletons in \(\mathscr {C}\), and let \(\mathcal {U}= \mathscr {C}{\setminus } \mathcal {S}\). We write \(s \in \mathcal {S}\) for simplicity if \(\{s\}\) is a singleton set in \(\mathscr {C}\).

For \(\mathbf {X}= (X_j)_{j \in [K]}\) and \(\mathbf {n}= (n_j)_{j \in [K]}\), we write \(\mathbf {X}_{\mathfrak {u}}\) and \(\mathbf {n}_{\mathfrak {u}}\) for their restrictions to \({\mathfrak {u}}\), and \(\mathbf {X}_{{\mathfrak {u}}}^{\diamond \mathbf {n}_{{\mathfrak {u}}}} = {\diamond }_{j \in {\mathfrak {u}}}\, X_{j}^{\diamond n_{j}}\). Splitting the left-hand side of (1.5) into sub-products within clusters \({\mathfrak {u}}\in \mathscr {C}\) and chaos expanding each sub-product, we can rewrite it as

$$\begin{aligned} \mathbf {E}\prod _{j=1}^{K} \partial _{\theta }^{r} \mathcal {H}_{m} \left( \mathrm{e}^{i \theta X_j} \right) = \sum _{N \ge 0} \sum _{{\mathop {|\mathbf {n}|=N}\limits ^{\mathbf {n}\in \mathbf {N}^K:}}} \left( \prod _{{\mathfrak {u}}\in \mathscr {C}} C_{\mathbf {n}_{\mathfrak {u}}}(\theta ,\mathbf {X}_{\mathfrak {u}}) \right) \left( \mathbf {E}\prod _{{\mathfrak {u}}\in \mathscr {C}} \mathbf {X}_{{\mathfrak {u}}}^{\diamond \mathbf {n}_{\mathfrak {u}}} \right) . \end{aligned}$$
(2.7)

Here, \(C_{\mathbf {n}_{\mathfrak {u}}}(\theta , \mathbf {X}_{\mathfrak {u}})\) is the coefficient of \(\mathbf {X}_{{\mathfrak {u}}}^{\diamond \mathbf {n}_{\mathfrak {u}}}\) in the chaos expansion of \(\prod _{j \in {\mathfrak {u}}} \partial _\theta ^r \mathcal {H}_{m}(\mathrm{e}^{i\theta X_j})\) and has the expression

$$\begin{aligned} C_{\mathbf {n}_{\mathfrak {u}}}(\theta , \mathbf {X}_{\mathfrak {u}}) = \frac{1}{\mathbf {n}_{\mathfrak {u}}!}\; \mathbf {E}\left( \prod _{j \in {\mathfrak {u}}} \partial _\theta ^r \partial _{X_j}^{n_j} \mathcal {H}_{m}\left( \mathrm{e}^{i\theta X_j} \right) \right) . \end{aligned}$$
(2.8)

Note that the product involving the expectation on the right-hand side of (2.7) is 0 if \(|\mathbf {n}|\) is odd. But we still sum over all integers N since this simplifies the notation later. The following lemma gives control on the coefficients \(C_{\mathbf {n}_{\mathfrak {u}}}\).

Lemma 2.2

There exists \(C>0\) depending on K, m, r, and \(\varLambda \) only such that

$$\begin{aligned} |C_{\mathbf {n}_{\mathfrak {u}}}(\theta ,\mathbf {X}_{\mathfrak {u}})| \le \frac{(C \langle \theta \rangle )^{|\mathbf {n}_{\mathfrak {u}}|}}{\mathbf {n}_{\mathfrak {u}}!} \end{aligned}$$
(2.9)

for all \(\mathbf {n}_{\mathfrak {u}}\in \mathbf {N}^{{\mathfrak {u}}}\), where we recall \(\langle \theta \rangle = 1 + |\theta |\). As a consequence, we have

$$\begin{aligned} \sum _{|\mathbf {n}|=N} \prod _{{\mathfrak {u}}\in \mathscr {C}} |C_{\mathbf {n}_{\mathfrak {u}}}(\theta ,\mathbf {X}_{\mathfrak {u}})| \le \frac{(C \langle \theta \rangle )^{N}}{N!}. \end{aligned}$$
(2.10)

Furthermore, if \(\mathcal {S}\ne \emptyset \), then we have

$$\begin{aligned} \sum _{|\mathbf {n}|=N} \prod _{{\mathfrak {u}}\in \mathscr {C}} |C_{\mathbf {n}_{\mathfrak {u}}}(\theta ,\mathbf {X}_{\mathfrak {u}})| \le \mathrm{e}^{-\frac{\theta ^2}{2\varLambda }} \cdot \frac{(C \langle \theta \rangle )^{N}}{N!}. \end{aligned}$$
(2.11)

All the bounds are uniform in \(\varepsilon \) and \(\theta \), and in the location of \(\mathbf {x}\) subject to whether \(\mathscr {C}\) contains a singleton or not.

Proof

We express the right-hand side of (2.8) in a way that is convenient to estimate. For this, we introduce variables \(\varvec{\beta }\in \mathbf {R}^{K}\). Let \(\varvec{\beta }_{\mathfrak {u}}\) denote the restriction of \(\varvec{\beta }\) to \({\mathfrak {u}}\), and write \(\partial _{\varvec{\beta }_{\mathfrak {u}}}^{r} = \prod _{j \in {\mathfrak {u}}}\partial _{\beta _j}^{r}\) and \(\varvec{\beta }_{\mathfrak {u}}^{\mathbf {n}_{\mathfrak {u}}} = \prod _{j \in {\mathfrak {u}}} \beta _{j}^{n_j}\). Using the identity

$$\begin{aligned} \partial _{X_j}^{n_j} \mathcal {H}_{m}\left( \mathrm{e}^{i \beta _j X_j} \right) = (i\beta _{j})^{n_j} \mathcal {H}_{m-n_j}\left( \mathrm{e}^{i \beta _j X_j} \right) , \end{aligned}$$

where \(\mathcal {H}_{m-n} = \mathcal {H}_0\) if \(n \ge m\), we can rewrite the right-hand side of (2.8) as

$$\begin{aligned} C_{\mathbf {n}_{\mathfrak {u}}}(\theta , \mathbf {X}_{\mathfrak {u}}) = \frac{i^{|\mathbf {n}_{\mathfrak {u}}|}}{\mathbf {n}_{\mathfrak {u}}!} \partial _{\varvec{\beta }_{\mathfrak {u}}}^{r} \left( \varvec{\beta }_{\mathfrak {u}}^{\mathbf {n}_{\mathfrak {u}}} \cdot \mathbf {E}\prod _{j \in {\mathfrak {u}}} \mathcal {H}_{m-n_j}\left( \mathrm{e}^{i \beta _j X_j} \right) \right) \Big |_{\beta _j=\theta \;, \forall j \in {\mathfrak {u}}}. \end{aligned}$$

When distributing the r derivatives in each \(\beta _j\) over the two factors in the parentheses, the differentiation of the first factor (\(\varvec{\beta }_{\mathfrak {u}}^{\mathbf {n}_{\mathfrak {u}}}\)) yields an additional factor of at most \(|\mathbf {n}_{\mathfrak {u}}|^{Kr} \le C^{|\mathbf {n}_{\mathfrak {u}}|}\) for some fixed constant C, while the second factor is uniformly bounded both in \(\theta \) and \(\mathbf {n}_{\mathfrak {u}}\) since the \(X_j\)'s all have bounded variance. This gives (2.9). The bound (2.10) follows from (2.9) and the multinomial theorem.

Finally, in order to obtain (2.11) when \(\mathcal {S}\ne \emptyset \), it suffices to note that for \(s \in \mathcal {S}\), we have

$$\begin{aligned} C_{n_s}(\theta ,X_s) = \frac{i^{n_s}}{n_{s}!} \partial _{\theta }^{r} \left( \theta ^{n_s} \mathrm{e}^{-\frac{\theta ^2 \mathbf {E}X_s^2}{2}} \right) \end{aligned}$$

if \(n_s \ge m\), and is 0 otherwise. Since \(\mathbf {E}X_s^2 \ge \frac{1}{\varLambda }\), we gain an additional Gaussian factor \(\mathrm{e}^{-\frac{\theta ^2}{2 \varLambda }}\) for every \(s \in \mathcal {S}\), and hence \(\mathrm{e}^{-\frac{\theta ^2 |\mathcal {S}|}{2 \varLambda }}\) in total. The bound (2.11) then follows from relaxing this factor to \(\mathrm{e}^{-\frac{\theta ^2}{2 \varLambda }}\). \(\square \)

2.3 Representative Point

For \(|{\mathfrak {u}}| \ge 2\), the corresponding term in the expectation on the right-hand side of (2.7) is a Wick product of multiple Gaussian random variables. We aim to reduce it to the Wick product of a single variable by choosing a representative point from each cluster.

For every \({\mathfrak {u}}\in \mathscr {C}\), choose \(u^{*}({\mathfrak {u}}) \in {\mathfrak {u}}\) arbitrarily. The choice of \(u^*\) is unique if \({\mathfrak {u}}\) is a singleton. We have the following proposition.

Proposition 2.3

There exists \(C>0\) such that

$$\begin{aligned} \mathbf {E}\left( \prod _{{\mathfrak {u}}\in \mathscr {C}} \mathbf {X}_{{\mathfrak {u}}}^{\diamond \mathbf {n}_{\mathfrak {u}}} \right) \le C^{|\mathbf {n}|} \cdot \mathbf {E}\left( \prod _{{\mathfrak {u}}\in \mathscr {C}} X_{u^*({\mathfrak {u}})}^{\diamond |\mathbf {n}_{\mathfrak {u}}|} \right) \end{aligned}$$
(2.12)

for every \(\mathbf {n}\in \mathbf {N}^K\) and every choice of \(u^*({\mathfrak {u}}) \in {\mathfrak {u}}\).

Proof

If \(|\mathbf {n}| = \sum _{{\mathfrak {u}}\in \mathscr {C}} |\mathbf {n}_{\mathfrak {u}}|\) is odd, then both sides of (2.12) are 0, so we only need to consider the situation when \(|\mathbf {n}|\) is even.

In this case, the left-hand side is the sum of products of pairwise expectations \(\mathbf {E}(X_j X_{j'})\) for j and \(j'\) belonging to different clusters. The right-hand side (without the factor \(C^{|\mathbf {n}|}\)) is the same except that each instance of \(X_j\) for \(j \in {\mathfrak {u}}\) is replaced by \(X_{u^*({\mathfrak {u}})}\). It then suffices to control the effects of such replacements.

Let \({\mathfrak {u}}, \mathfrak {v}\) be two different clusters in \(\mathscr {C}\), and use \(u^*\) and \(v^*\) to denote \(u^*({\mathfrak {u}})\) and \(u^*(\mathfrak {v})\), respectively. For every \(j \in {\mathfrak {u}}\) and \(j' \in \mathfrak {v}\), according to the clustering, we have

$$\begin{aligned} |x_{u^*} - x_{j}| \le (K-1) |x_{j}-x_{j'}|\;, \quad |x_{v^*} - x_{j'}| \le (K-1) |x_{j} - x_{j'}|\;, \end{aligned}$$

which implies

$$\begin{aligned} |x_{u^*} - x_{v^*}| \le |x_{u^*} - x_j| + |x_j - x_{j'}| + |x_{v^*} - x_{j'}| \le 2K |x_{j} - x_{j'}|. \end{aligned}$$

Hence, by (2.1), we deduce that

$$\begin{aligned} \mathbf {E}(X_{j} X_{j'}) \le (2K)^{\alpha } \varLambda ^2 \; \mathbf {E}(X_{u^*} X_{v^*}). \end{aligned}$$

This is the effect of one such replacement. The claim then follows since there are \(\frac{|\mathbf {n}|}{2}\) pairwise expectations in each product in the sum. It also means we can take \(C = (2K)^{\frac{\alpha }{2}} \varLambda \) in (2.12). \(\square \)

From now on, we restrict ourselves to the situation when \(|\mathcal {S}| \ge 1\), and recall the notation \(\mathcal {U}= \mathscr {C}{\setminus } \mathcal {S}\). We need to split the product on the right-hand side of (2.7) into sub-products in \(\mathcal {S}\) and in \(\mathcal {U}\). For this, we introduce multi-indices \(\mathbf {k}=(k_s)_{s \in \mathcal {S}}\) and \(\varvec{\ell }= (\ell _{\mathfrak {u}})_{{\mathfrak {u}}\in \mathcal {U}}\) and write \(|\mathbf {k}| = \sum _{s \in \mathcal {S}} k_s\) and \(|\varvec{\ell }| = \sum _{{\mathfrak {u}}\in \mathcal {U}} \ell _{\mathfrak {u}}\). We then have the following proposition.

Proposition 2.4

Suppose \(|\mathcal {S}| \ge 1\). Then, the left-hand side of (1.5) can be controlled by

$$\begin{aligned} \begin{aligned} \left| \mathbf {E}\prod _{j=1}^{K} \partial _{\theta }^{r} \mathcal {H}_{m} \left( \mathrm{e}^{i \theta X_j} \right) \right|&\le C \mathrm{e}^{-\frac{\theta ^2}{2 \varLambda }} \sum _{N \ge 0} \left( \frac{\left( C \langle \theta \rangle \right) ^{N+m|\mathcal {S}|}}{(N+m|\mathcal {S}|)!} \right. \\&\quad \left. \times \sup _{|\mathbf {k}|+|\varvec{\ell }|=N} \mathbf {E}\left[ \left( \prod _{s \in \mathcal {S}} X_{s}^{\diamond (m+k_s)}\right) \left( \prod _{{\mathfrak {u}}\in \mathcal {U}} X_{u^*({\mathfrak {u}})}^{\diamond \ell _{\mathfrak {u}}}\right) \right] \right) . \end{aligned} \end{aligned}$$
(2.13)

Proof

We start with the expression (2.7). Note that for \(s \in \mathcal {S}\), \(C_{n_s}(\theta , X_s)=0\) whenever \(n_s < m\), so we can relax the expression to

$$\begin{aligned} \left| \mathbf {E}\prod _{j=1}^{K} \partial _{\theta }^{r} \mathcal {H}_{m} \left( \mathrm{e}^{i \theta X_j} \right) \right| \le \sum _{N \ge m|\mathcal {S}|} \left( \sum _{\mathbf {n}} \prod _{{\mathfrak {u}}\in \mathscr {C}} |C_{\mathbf {n}_{\mathfrak {u}}}(\theta ,\mathbf {X}_{\mathfrak {u}})| \right) \left( \sup _{\mathbf {n}} \mathbf {E}\prod _{{\mathfrak {u}}\in \mathscr {C}} \mathbf {X}_{{\mathfrak {u}}}^{\diamond \mathbf {n}_{\mathfrak {u}}} \right) , \end{aligned}$$

where both the sum and supremum are taken over \(|\mathbf {n}|=N\) with the further restriction that \(n_s \ge m\) for all \(s \in \mathcal {S}\). The claim follows immediately by applying Lemma 2.2 and Proposition 2.3 to the right-hand side above and noting the range of the sum and supremum. \(\square \)

2.4 Graphic Representation

It remains to control the term involving the expectation on the right-hand side of (2.13). Since all \(X_j\)’s are Gaussian, it can be written as a sum over products of pairwise expectations. The number of terms in each product (and hence the total power) can be arbitrarily large since N will be summed over all integers. Following [13], we introduce graphic notations to describe these objects.

Given a set \(\mathbb {V}\), we write \(\mathbb {V}_2\) for the set of all subsets of \(\mathbb {V}\) with exactly two elements. A (generalized) graph is a triple \(\varGamma = (\mathbb {V}, \mathbb {E}, R)\). Here, \(\mathbb {V}\) is the set of vertices, and \(\mathbb {E}: \mathbb {V}_2 \rightarrow \mathbf {N}\) is the set of edges with multiplicities. More precisely, each edge \(\{x,y\} \in \mathbb {V}_2\) has multiplicity \(\mathbb {E}(x,y) = \mathbb {E}(y,x)\). We do not allow self-loops, so \(\mathbb {E}(x,x)=0\) for all \(x \in \mathbb {V}\). Finally, \(R: \mathbb {V}_2 \rightarrow \mathbf {R}\) is a function that assigns a value to each pair of vertices.

Given a graph \(\varGamma = (\mathbb {V}, \mathbb {E}, R)\), we define the degree of a point \(x \in \mathbb {V}\) and of \(\varGamma \), respectively, by

$$\begin{aligned} \deg (x) := \sum _{y \in \mathbb {V}} \mathbb {E}(x,y)\;, \qquad \deg (\varGamma ) := \sum _{x \in \mathbb {V}} \deg (x). \end{aligned}$$

The value of \(\varGamma \) is defined by

$$\begin{aligned} |\varGamma | := \prod _{e \in \mathbb {V}_2} \left( R(e)\right) ^{\mathbb {E}(e)}. \end{aligned}$$
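For concreteness, here is a minimal rendering (ours) of this graph structure in code: edge multiplicities \(\mathbb {E}\) live on two-element subsets of \(\mathbb {V}\), and \(\deg \) and \(|\varGamma |\) are computed as above. The class name and the example values are not from the paper.

```python
# A small data-structure sketch (ours) of the generalized graphs (V, E, R).
from itertools import combinations

class Graph:
    def __init__(self, vertices, R):
        self.V = list(vertices)
        self.R = R                                   # dict: frozenset{x,y} -> R(x,y)
        self.E = {frozenset(e): 0 for e in combinations(self.V, 2)}  # no self-loops

    def add_edge(self, x, y, mult=1):
        self.E[frozenset((x, y))] += mult

    def deg(self, x):
        return sum(m for e, m in self.E.items() if x in e)

    def value(self):                                 # |Gamma| = prod_e R(e)^{E(e)}
        out = 1.0
        for e, m in self.E.items():
            if m:
                out *= self.R[e] ** m
        return out

# Example with three vertices and a double edge:
R = {frozenset(p): 0.5 for p in combinations("abc", 2)}
g = Graph("abc", R)
g.add_edge("a", "b", 2); g.add_edge("b", "c")
print(g.deg("b"), g.value())                         # prints: 3 0.125
```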

In what follows, we always take \(\mathbb {V}= \{x_j\}_{j=1}^{K}\) fixed (and so are the clusters in \(\mathscr {C}\)), and \(R(x_j, x_{j'}) = \mathbf {E}(X_j X_{j'})\). We also fix the representative points \(u^*({\mathfrak {u}})\) chosen for each \({\mathfrak {u}}\in \mathscr {C}\). Hence, the only variable in our graph is the multiplicity \(\mathbb {E}\) of the edges. Recall the decomposition \(\mathscr {C}= \mathcal {S}\cup \mathcal {U}\) into singletons and clusters with at least two points. We introduce the following definition to characterize the pairings that appear in the expectation term on the right-hand side of (2.13).

Definition 2.5

For each \(\mathbf {k}\in \mathbf {N}^{\mathcal {S}}\) and \(\varvec{\ell }\in \mathbf {N}^{\mathcal {U}}\), the set \(\varOmega _{\mathbf {k},\varvec{\ell }}\) consists of graphs with \(\mathbb {V}\) and R specified above, and such that \(\deg (x_s)=m+k_s\) for all \(s \in \mathcal {S}\), \(\deg (x_{u^*({\mathfrak {u}})})=\ell _{\mathfrak {u}}\) for all \({\mathfrak {u}}\in \mathcal {U}\), and \(\deg (x)=0\) for all other \(x \in \mathbb {V}\).

Let \(\varOmega ^*\) be the set of graphs \(\varGamma \) such that both of the following hold:

  1. \(\varGamma \in \varOmega _{\mathbf {k},\varvec{\ell }}\) for some \(\mathbf {k}\in \mathbf {N}^{\mathcal {S}}\) and \(\varvec{\ell }\in \mathbf {N}^{\mathcal {U}}\) with the restriction that \(k_s \in \{0,1\}\) for every \(s \in \mathcal {S}\) and \(\ell _{\mathfrak {u}}\le m+1\) for each \({\mathfrak {u}}\in \mathcal {U}\).

  2. If \(\ell _{\mathfrak {u}}\ge 1\) for some \({\mathfrak {u}}\in \mathcal {U}\), then there exists \(s \in \mathcal {S}\) such that \(\mathbb {E}(x_s, x_{u^*({\mathfrak {u}})}) = \ell _{\mathfrak {u}}\).

Remark 2.6

The first requirement for \(\varOmega ^*\) above is equivalent to requiring that \(\deg (x_s) \in \{m,m+1\}\) for all \(s \in \mathcal {S}\), that \(\deg (x_{u^*({\mathfrak {u}})}) \le m+1\) for all \({\mathfrak {u}}\in \mathcal {U}\), and that the degree is 0 for all other points. The second requirement says that if \(x_{u^*({\mathfrak {u}})}\) has a nonzero degree, then all its edges must be connected to a single point \(x_s\) for some \(s \in \mathcal {S}\). We will see later that the definition of \(\varOmega ^*\) corresponds to “minimal graphs” after the reduction procedure in the next subsection.

Remark 2.7

The clustering depends on the choice of L, and hence so do the definitions of \(\varOmega _{\mathbf {k},\varvec{\ell }}\) and \(\varOmega ^*\). On the other hand, these are just intermediate steps and our final bound (1.5) does not involve clustering at all. Furthermore, the choice of L later [in (2.18)] is also independent of the location of \(\mathbf {x}\). Hence, we omit the dependence of the clustering on L here for notational simplicity.

2.5 Reduction

We now start to control the right-hand side of (2.13). If \(m|\mathcal {S}| + |\mathbf {k}| + |\varvec{\ell }|\) is odd, then the term with the expectation is 0. So we only need to deal with the case when \(m|\mathcal {S}| + |\mathbf {k}| + |\varvec{\ell }|\) is even.

In that case, the number of different pairings contributing to the expectation in (2.13) is at most \((m|\mathcal {S}|+|\mathbf {k}|+|\varvec{\ell }|-1)!!\), so with Definition 2.5, we have

$$\begin{aligned} \mathbf {E}\left[ \left( \prod _{s \in \mathcal {S}} X_{s}^{\diamond (m+k_s)}\right) \left( \prod _{{\mathfrak {u}}\in \mathcal {U}} X_{u^*({\mathfrak {u}})}^{\diamond \ell _{\mathfrak {u}}}\right) \right] \le (|\mathbf {k}|+|\varvec{\ell }|+m|\mathcal {S}|-1)!! \cdot \sup _{\varGamma \in \varOmega _{\mathbf {k},\varvec{\ell }}} |\varGamma |. \end{aligned}$$
(2.14)

Comparing the above bound and the right-hand side of (2.13), we see that we need to control \(|\varGamma |\) for \(\varGamma \in \varOmega _{\mathbf {k},\varvec{\ell }}\) with arbitrarily large \(\mathbf {k}\) and \(\varvec{\ell }\). We first bound it by values of the graphs in \(\varOmega ^*\), which is done via a reduction procedure. After that, we enhance the graphs in \(\varOmega ^*\) to match the right-hand side of (1.5) to conclude the proof.

We start with the reduction step. This is where we need to choose the clustering distance L sufficiently large, which will ensure a bound uniform in \(\theta \) after summing over \(\mathbf {k}\) and \(\varvec{\ell }\). We first give the following proposition, which reduces graphs in \(\varOmega _{\mathbf {k},\varvec{\ell }}\) to those in \(\varOmega ^*\).

Proposition 2.8

There exists \(C>0\) depending on \(\varLambda \) only such that

$$\begin{aligned} \max _{\varGamma \in \varOmega _{\mathbf {k},\varvec{\ell }}} |\varGamma | \le \max _{\varGamma ^* \in \varOmega ^*} \left( \left( C L^{-\alpha }\right) ^{\frac{1}{2} (\deg (\varGamma )-\deg (\varGamma ^*))} \cdot |\varGamma ^*| \right) \end{aligned}$$

for every pair \((\mathbf {k},\varvec{\ell })\) and every \(L>0\). The constant C does not depend on the choice of L, though the clusters and the definitions of \(\varOmega _{\mathbf {k},\varvec{\ell }}\) and \(\varOmega ^*\) do.

Proof

Fix \(\mathbf {k}\in \mathbf {N}^{\mathcal {S}}\), \(\varvec{\ell }\in \mathbf {N}^{\mathcal {U}}\), and \(\varGamma \in \varOmega _{\mathbf {k},\varvec{\ell }}\) arbitrary. It suffices to show that if \(\varGamma \notin \varOmega ^{*}\), then we can find a \(\bar{\varGamma } \in \varOmega _{\bar{\mathbf {k}}, \bar{\varvec{\ell }}}\) with \(\bar{\mathbf {k}} \le \mathbf {k}\), \(\bar{\varvec{\ell }} \le \varvec{\ell }\) and \(|\bar{\mathbf {k}}|+|\bar{\varvec{\ell }}| < |\mathbf {k}|+|\varvec{\ell }|\) strictly such that

$$\begin{aligned} |\varGamma | \le \left( C L^{-\alpha }\right) ^{\frac{1}{2}(|\mathbf {k}-\bar{\mathbf {k}}|+|\varvec{\ell }-\bar{\varvec{\ell }}|)} |\bar{\varGamma }|. \end{aligned}$$
(2.15)

One can then iterate this bound until the graph is reduced to some \(\varGamma ^* \in \varOmega ^*\) to conclude the proposition. This necessarily happens since each time the total degree of the graph decreases strictly. Here, the inequality on \(\mathbf {k}\) and \(\varvec{\ell }\) means the inequality in each component.

To see the existence of such a \(\bar{\varGamma }\) when \(\varGamma \in \varOmega _{\mathbf {k},\varvec{\ell }} {\setminus } \varOmega ^*\), we consider the two situations where either one of the two conditions for \(\varOmega ^*\) in Definition 2.5 is violated. We first consider the violation of Condition 1. Since \(\varGamma \in \varOmega _{\mathbf {k},\varvec{\ell }}\), failure of Condition 1 means there exists \(j \in \mathcal {S}\cup \{u^*({\mathfrak {u}}): {\mathfrak {u}}\in \mathcal {U}\}\) such that \(\deg (x_j) \ge m+2\). We fix this j, and there are two possibilities in this situation.

Case 1. There exist \(i \ne i'\) such that \(\mathbb {E}(x_j, x_i) \ge 1\) and \(\mathbb {E}(x_j, x_{i'}) \ge 1\). In this case, we let \(\bar{\varGamma }\) be the graph obtained from \(\varGamma \) by performing the following operations:

$$\begin{aligned} \mathbb {E}(x_j, x_{i}) \mapsto \mathbb {E}(x_{j}, x_{i})-1, \quad \mathbb {E}(x_j, x_{i'}) \mapsto \mathbb {E}(x_{j}, x_{i'})-1, \quad \mathbb {E}(x_{i}, x_{i'}) \mapsto \mathbb {E}(x_{i}, x_{i'})+1. \end{aligned}$$

The only point whose degree has been changed in this operation is \(x_j\) (reduced by 2). Hence, we have \(\bar{\varGamma } \in \varOmega _{\bar{\mathbf {k}}, \bar{\varvec{\ell }}}\) with \((\bar{\mathbf {k}},\bar{\varvec{\ell }}) \le (\mathbf {k},\varvec{\ell })\) and \(|\mathbf {k}-\bar{\mathbf {k}}| + |\varvec{\ell }-\bar{\varvec{\ell }}|=2\). To see the bound (2.15), we note that by definition of \(\varOmega _{\mathbf {k},\varvec{\ell }}\), \(x_j\) is at least \(L \varepsilon \) away from both \(x_i\) and \(x_{i'}\). Hence, by (2.2), we have

$$\begin{aligned} \mathbf {E}(X_j X_i) \cdot \mathbf {E}(X_j X_{i'}) \le \frac{2^\alpha \varLambda ^3}{(L+1)^{\alpha }} \cdot \mathbf {E}(X_i X_{i'}). \end{aligned}$$

In graphic notation (where we omit x and simply write the indices to denote vertices), this means that the two edges joining \(x_j\) to \(x_i\) and to \(x_{i'}\) may be replaced by a single edge between \(x_i\) and \(x_{i'}\) at the cost of the factor \(\frac{2^\alpha \varLambda ^3}{(L+1)^{\alpha }}\). Since all the other parts of the graph remain unchanged, this operation gives a desired \(\bar{\varGamma }\) with (2.15).

Case 2. If, for the \(x_j\) that violates Condition 1, all its edges are connected to another single point \(x_i\), then we necessarily have \(\deg (x_i) \ge m+2\). Thus, we let \(\bar{\varGamma }\) be the graph obtained from \(\varGamma \) by reducing \(\mathbb {E}(x_j, x_i)\) by two. Then, \(\bar{\varGamma } \in \varOmega _{\bar{\mathbf {k}},\bar{\varvec{\ell }}}\) with \((\bar{\mathbf {k}}, \bar{\varvec{\ell }}) \le (\mathbf {k}, \varvec{\ell })\), but this time \(|\mathbf {k}-\bar{\mathbf {k}}|+|\varvec{\ell }-\bar{\varvec{\ell }}|=4\). Since \(|x_j - x_i| \ge L \varepsilon \), we also have the bound

$$\begin{aligned} |\varGamma | \le \frac{\varLambda ^2}{(L+1)^{2\alpha }} |\bar{\varGamma }|, \end{aligned}$$

which is also of the form (2.15). This completes the treatment of the violation of Condition 1.

We now turn to the situation when Condition 2 is violated. This means there exists \({\mathfrak {u}}\in \mathcal {U}\) such that

  (a) either \(x_{u^*({\mathfrak {u}})}\) is connected to two other different points \(x_i\) and \(x_{i'}\);

  (b) or \(x_{u^*({\mathfrak {u}})}\) is connected to \(x_{u^*({\mathfrak {u}}')}\) for some \({\mathfrak {u}}' \in \mathcal {U}\).

For (a), we perform exactly the same operation as Case 1 in the above situation. This will give rise to a graph in \(\varOmega _{\bar{\mathbf {k}},\bar{\varvec{\ell }}}\) with \(\bar{\mathbf {k}}=\mathbf {k}\), \(\bar{\ell }_{\mathfrak {u}}= \ell _{\mathfrak {u}}-2\) and \(\bar{\ell }_{{\mathfrak {u}}'}=\ell _{{\mathfrak {u}}'}\) for all other \({\mathfrak {u}}' \in \mathcal {U}\), and satisfying (2.15). For (b), we simply reduce \(\mathbb {E}(x_{u^*({\mathfrak {u}})}, x_{u^*({\mathfrak {u}}')})\) by 1, which results in a graph in \(\varOmega _{\bar{\mathbf {k}}, \bar{\varvec{\ell }}}\) with \(|\mathbf {k}-\bar{\mathbf {k}}|+|\varvec{\ell }-\bar{\varvec{\ell }}|=2\) and the desired bound (2.15).

Since the above cases have covered all the possibilities for \(\varGamma \in \varOmega _{\mathbf {k},\varvec{\ell }} {\setminus } \varOmega ^*\), we have completed the proof of the proposition. \(\square \)

The following proposition is then a simple consequence.

Proposition 2.9

There exist \(C>0\) and \(L>0\) depending on \(\varLambda \), K, m, and r only such that for every \(\theta \in \mathbf {R}\), every location \(\mathbf {x}\in (\mathbf {R}^d)^K\), and every \(\varepsilon \in (0,1)\), we have the bound

$$\begin{aligned} \left| \mathbf {E}\prod _{j=1}^{K} \partial _{\theta }^{r} \mathcal {H}_{m}\left( \mathrm{e}^{i \theta X_j}\right) \right| \le C \max _{\varGamma \in \varOmega ^*} |\varGamma |. \end{aligned}$$
(2.16)

Remark 2.10

The bound is completely independent of \(\theta \) and \(\varepsilon \), and its dependence on the location of \(\mathbf {x}\) is via \(\varOmega ^*\) only. Also note that the clustering, and hence \(\varOmega ^*\), depends on the choice of L.

Proof of Proposition 2.9

Note that graphs in \(\varOmega _{\mathbf {k},\varvec{\ell }}\) have degree \(|\mathbf {k}|+|\varvec{\ell }| + m|\mathcal {S}|\), so by Proposition 2.8, there exists \(\varGamma ^* \in \varOmega ^*\) such that

$$\begin{aligned} \max _{\varGamma \in \varOmega _{\mathbf {k},\varvec{\ell }}} |\varGamma | \le (CL^{\frac{\alpha }{2}})^{\deg (\varGamma ^*)} \cdot (CL^{-\alpha })^{\frac{1}{2}(|\mathbf {k}|+|\varvec{\ell }|+m|\mathcal {S}|)} \cdot |\varGamma ^*| \end{aligned}$$

for all \(\mathbf {k}\) and \(\varvec{\ell }\). Combining it with Proposition 2.4 and (2.14), we get

$$\begin{aligned} \left| \mathbf {E}\prod _{j=1}^{K} \partial _{\theta }^{r} \mathcal {H}_{m}\left( \mathrm{e}^{i \theta X_j}\right) \right| \le C L^{\frac{\alpha }{2} \cdot \deg (\varGamma ^*)} \cdot \exp \left( -\frac{\theta ^2}{2 \varLambda } + \frac{C_0 \langle \theta \rangle ^{2}}{L^{\alpha }}\right) \cdot \max _{\varGamma ^* \in \varOmega ^*} |\varGamma ^*| \end{aligned}$$
(2.17)

for some constant \(C_0\). One can choose L sufficiently large depending on \(C_0\) and \(\varLambda \) only such that

$$\begin{aligned} \frac{1}{L^{\alpha }} < \frac{1}{4 C_0 \varLambda }. \end{aligned}$$
(2.18)

This guarantees that the exponential term is uniformly bounded in \(\theta \). Since \(C_0\) depends on \(\varLambda \), K, m, and r only, so does L. Finally, \(L^{\frac{\alpha }{2} \cdot \deg (\varGamma ^*)}\) is also uniformly bounded since graphs in \(\varOmega ^*\) have degrees at most \((m+1)K\). This completes the proof. \(\square \)

Remark 2.11

The reason why we need to choose L large is to ensure that the exponential, and hence the whole right-hand side of (2.17), is uniformly bounded in \(\theta \). As we have just seen, the Gaussian factor \(\mathrm{e}^{-\frac{\theta ^2}{2 \varLambda }}\) in (2.11) allows us to choose such an L independent of \(\theta \). Together with the enhancement procedure in Sect. 2.6, this ensures the bounds in Proposition 2.9 and hence in Theorem 1.1 are completely independent of \(\theta \).

Without the Gaussian factor, one would need to take L quadratic in \(\theta \) to make the exponential in (2.17) bounded, and the enhancement procedure in below would produce a bound that is polynomial in \(\theta \) with its degree depending on m and K.

2.6 Enhancement and Conclusion of the Proof

From now on, we fix the choice of L in (2.18). We need to control the right-hand side of (2.16) by that of (1.5). To achieve this, we enhance every \(\varGamma \in \varOmega ^*\) to a graph \(\mathrm{Enh}(\varGamma )\) in which \(\deg (x_j) \in \{m,m+1\}\) for every \(j \in [K]\), which matches the pairings occurring in the desired upper bound. The enhancement procedure will also be performed in such a way that \(|\mathrm{Enh}(\varGamma )|\) is an upper bound for \(|\varGamma |\) up to some proportionality constant, which is uniform in \(\varepsilon \), \(\theta \), and \(\mathbf {x}\) subject to \(|\mathcal {S}| \ge 1\). This will lead to the bound (1.5). The procedure is similar to the one used to obtain (2.6) when \(|\mathcal {S}|=0\).

Fix \(\varGamma \in \varOmega ^*\) arbitrary, so in particular, \(\varGamma \in \varOmega _{\mathbf {k},\varvec{\ell }}\) for some \(\mathbf {k}\in \mathbf {N}^\mathcal {S}\) and \(\varvec{\ell }\in \mathbf {N}^\mathcal {U}\). By the definition of \(\varOmega ^*\), \(\deg (x_s) \in \{m,m+1\}\) for all \(s \in \mathcal {S}\). For every \({\mathfrak {u}}\in \mathcal {U}\), we have \(\deg (x_{u^*})=\ell _{\mathfrak {u}}\le m+1\), and if \(\ell _{\mathfrak {u}}\ge 1\), all of these edges are connected to one single \(x_s\) for some \(s \in \mathcal {S}\). All other points in \({\mathfrak {u}}\) have degree 0. To construct \(\mathrm{Enh}(\varGamma )\), we add new edges to vertices in \({\mathfrak {u}}\in \mathcal {U}\) and also move around existing edges, but keep \(\deg (x_s)\) unchanged for all \(s \in \mathcal {S}\) throughout the procedure. We do this cluster by cluster and write \(u^*=u^*({\mathfrak {u}})\) for simplicity.

Fix \({\mathfrak {u}}\in \mathcal {U}\) arbitrary. To perform the enhancement operation for \({\mathfrak {u}}\), we let \(s \in \mathcal {S}\) be such that \(x_s\) is the unique singleton point connected to \(x_{u^*({\mathfrak {u}})}\) if \(\ell _{\mathfrak {u}}\ge 1\). This also includes \(\ell _{\mathfrak {u}}=0\), in which case s could be arbitrary. We distinguish several situations depending on the number of points in \({\mathfrak {u}}\).

Case 1. \(|{\mathfrak {u}}| = 2\). Let \(j \ne u^*({\mathfrak {u}})\) denote the other point in \({\mathfrak {u}}\). By definition of \(\varOmega ^*\), we have \(\deg (x_j)=0\). We then perform the following operations. We move \(\left\lfloor (\ell _{\mathfrak {u}}+1)/2 \right\rfloor \) of the \(\ell _{\mathfrak {u}}\) edges between \(x_s\) and \(x_{u^*}\) so that they connect \(x_s\) and \(x_j\), and add \(m-\left\lfloor \ell _{\mathfrak {u}}/2 \right\rfloor \) edges between \(x_{u^*}\) and \(x_{j}\). By clustering, we have \(|x_{u^*}-x_{j}| \le L \varepsilon \) and \(|x_{s}-x_j| \le 2|x_s-x_{u^*}|\). Hence, Proposition 2.1 gives the bounds

$$\begin{aligned} \begin{aligned} (\mathbf {E}X_s X_{u^*})^{\ell _{\mathfrak {u}}}&\le C (\mathbf {E}X_s X_{u^*})^{\left\lfloor \frac{\ell _{\mathfrak {u}}}{2} \right\rfloor } (\mathbf {E}X_{s} X_j)^{\left\lfloor \frac{\ell _{\mathfrak {u}}+1}{2} \right\rfloor }, \\ 1&\le C \left( (L+1)^{\alpha } \; \mathbf {E}(X_{u^*}X_{j})\right) ^{m-\left\lfloor \frac{\ell _{\mathfrak {u}}}{2} \right\rfloor }, \end{aligned} \end{aligned}$$

where L as chosen in (2.18) is independent of \(\theta \), \(\varepsilon \), and \(\mathbf {x}\). In graphic notation (we drop \(|\cdot |\) and simply use a graph to denote its value), the above operation combines these two bounds within the cluster \({\mathfrak {u}}\); we do not display the remaining \((m-\ell _{\mathfrak {u}})\) or \((m+1-\ell _{\mathfrak {u}})\) edges from \(x_s\). Then, \(\deg (x_s)=m\) or \(m+1\) is unchanged in the procedure. Furthermore, we have \(\deg (x_{u^*})=m\) and \(\deg (x_j)\in \{m,m+1\}\) after the operation. This also includes the situation \(\ell _{\mathfrak {u}}=0\).

Case 2. \(|{\mathfrak {u}}|=3\). Let i, j denote the two other points in \({\mathfrak {u}}\). We then perform an analogous operation, moving part of the edges between \(x_s\) and \(x_{u^*({\mathfrak {u}})}\) and adding edges within the cluster. We see \(\deg (x_s)\) is unchanged. One can also check that \(\deg (x_{u^*({\mathfrak {u}})}) = m\) or \(m+1\), and \(\deg (x_i) = \deg (x_j) = m\). So we have the correct degrees of the vertices as well as the desired bound.

Case 3. \(|{\mathfrak {u}}| \ge 4\). We denote the other \(|{\mathfrak {u}}|-1\) points in the cluster by \(j_{1}, \dots , j_{|{\mathfrak {u}}|-1}\). For \(|{\mathfrak {u}}|-2\) of them, say \(x_{j_1}, \dots , x_{j_{|{\mathfrak {u}}|-2}}\), we perform the same operation as in Sect. 2.1 by cyclically connecting them with edges of multiplicities \(\left\lfloor \frac{m+1}{2} \right\rfloor \). This yields the bound

$$\begin{aligned} 1 \le C \left( \mathbf {E}(X_{j_1} X_{j_2}) \cdots \mathbf {E}(X_{j_{|{\mathfrak {u}}|-3}} X_{j_{|{\mathfrak {u}}|-2}}) \mathbf {E}(X_{j_{|{\mathfrak {u}}|-2}} X_{j_1}) \right) ^{\left\lfloor \frac{m+1}{2} \right\rfloor }. \end{aligned}$$

For the remaining points \(u^*\) and \(j_{|{\mathfrak {u}}|-1}\), we perform the same operation as in Case 1 above. This again raises the degrees of all points in \({\mathfrak {u}}\) to m or \(m+1\) with a desired bound.

Every cluster \({\mathfrak {u}}\in \mathcal {U}\) falls into one of the above three cases. The graph \(\mathrm{Enh}(\varGamma )\) is obtained by performing the above operations to all \({\mathfrak {u}}\in \mathcal {U}\). It is clear from the bounds in the above three situations that there exists \(C>0\) such that

$$\begin{aligned} |\varGamma | \le C |\mathrm{Enh}(\varGamma )|. \end{aligned}$$

It is also straightforward to check that in \(\mathrm{Enh}(\varGamma )\), we have \(\deg (x_j) \in \{m,m+1\}\) for all \(j \in [K]\), and hence it represents one of the pairings from the expectation

$$\begin{aligned} \mathbf {E}\prod _{j=1}^{K} \left( X_{j}^{\diamond m} + X_{j}^{\diamond (m+1)} \right) . \end{aligned}$$

Hence, we deduce there exists \(C>0\) such that

$$\begin{aligned} |\varGamma | \le C |\mathrm{Enh}(\varGamma )| \le C \mathbf {E}\prod _{j=1}^{K} \left( X_{j}^{\diamond m} + X_{j}^{\diamond (m+1)} \right) \end{aligned}$$
(2.19)

for all \(\theta \), \(\varepsilon \), and \(\mathbf {x}\), and this is true for all \(\varGamma \in \varOmega ^*\). Combining (2.19) and Proposition 2.9, we obtain the bound (1.5) in the case \(|\mathcal {S}| \ge 1\).

Since the bound when \(\mathcal {S}=\emptyset \) has already been established in (2.6), we have thus completed the proof of Theorem 1.1.

3 Convergence of the Fields—Proof of Theorem 1.4

We are now ready to prove Theorem 1.4. For notational simplicity, we write \(A \lesssim _{{\alpha }}B\) to denote that \(A \le C B\), where the constant C depends only on the parameter(s) in the subscript of \(\lesssim \) (in this case \(\alpha \)).

In order to apply the bound in Theorem 1.1, we adopt the convention for \(\widehat{F}\) that

$$\begin{aligned} F(x) = \int _{\mathbf {R}} \widehat{F}(\theta ) \mathrm{e}^{i \theta x} \mathrm{d} \theta . \end{aligned}$$
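For instance, under this convention the hypothetical choice \(F(x) = \mathrm{e}^{-x^2/2}\) corresponds to \(\widehat{F}(\theta ) = (2\pi )^{-\frac{1}{2}} \mathrm{e}^{-\theta ^2/2}\), since

$$\begin{aligned} \int _{\mathbf {R}} \frac{1}{\sqrt{2\pi }} \mathrm{e}^{-\theta ^2/2} \mathrm{e}^{i \theta x} \mathrm{d}\theta = \mathrm{e}^{-x^2/2} \end{aligned}$$

is the characteristic function of a standard Gaussian.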

This convention appears only in intermediate steps, and the final statement does not depend on the normalization of \(\widehat{F}\). For every \(\varphi : \mathbf {R}^d \rightarrow \mathbf {R}\), every \(x \in \mathbf {R}^d\), and every \(\lambda >0\), we let

$$\begin{aligned} \varphi _{x}^{\lambda }(y) = \lambda ^{-|\mathfrak {s}|} \varphi \left( \frac{y_1-x_1}{\lambda ^{s_1}}, \dots , \frac{y_d - x_d}{\lambda ^{s_d}} \right) . \end{aligned}$$
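This normalization preserves the total mass of \(\varphi \): writing \(|\mathfrak {s}| = s_1 + \dots + s_d\) (the convention suggested by the scaling above), the change of variables \(y_i = x_i + \lambda ^{s_i} u_i\) gives

$$\begin{aligned} \int _{\mathbf {R}^d} \varphi _{x}^{\lambda }(y) \mathrm{d}y = \lambda ^{-|\mathfrak {s}|} \cdot \lambda ^{s_1 + \dots + s_d} \int _{\mathbf {R}^d} \varphi (u) \mathrm{d}u = \int _{\mathbf {R}^d} \varphi (u) \mathrm{d}u. \end{aligned}$$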

Recall that \(\varPsi _\varepsilon = \rho _\varepsilon * \varPsi \), where \(\rho _\varepsilon \) is the rescaled mollifier. Also recall the form of \(a_m\) in (1.7). We first give the convergence criterion.

Proposition 3.1

Let \(\kappa >0\) and \(m < \frac{|\mathfrak {s}|}{\alpha }\). If for every compact \(\mathcal {K}\subset \mathbf {R}^d\) and every \(n \in \mathbf {N}\), we have

$$\begin{aligned} \sup _{\lambda \in (\varepsilon , 1)} \sup _{x \in \mathcal {K}} \sup _{\substack{\varphi \in \mathcal {C}_{c}^{\infty }(\mathcal {K}):\\ \Vert \varphi \Vert _{\mathcal {C}^{\frac{m \alpha }{2}}} \le 1}} \lambda ^{\frac{m \alpha }{2}+\kappa } \left( \mathbf {E}|\langle \varepsilon ^{-\frac{m \alpha }{2}} \mathcal {H}_{m} \left( F(\varepsilon ^{\frac{\alpha }{2}} \varPsi _{\varepsilon }) \right) - a_{m} \varPsi ^{\diamond m}, \varphi _x^\lambda \rangle |^{2n} \right) ^{\frac{1}{2n}} \rightarrow 0 \end{aligned}$$
(3.1)

as \(\varepsilon \rightarrow 0\), then \(\varepsilon ^{-\frac{m \alpha }{2}} \mathcal {H}_{m}\big (F(\varepsilon ^{\frac{\alpha }{2}} \varPsi _\varepsilon )\big ) \rightarrow a_m \varPsi ^{\diamond m}\) in \(\mathcal {C}^{-\frac{m\alpha }{2}-\kappa '}\) for every \(\kappa ' > \kappa \).

The proposition follows from the standard Kolmogorov continuity criterion, so to prove Theorem 1.4 it remains to verify (3.1). By stationarity, we may restrict to the case \(x=0\) in (3.1). Writing \(\Vert \cdot \Vert _{2n} := \big (\mathbf {E}|\cdot |^{2n} \big )^{\frac{1}{2n}}\) as well as \(\varPhi _\varepsilon = \varepsilon ^{\frac{\alpha }{2}} \varPsi _\varepsilon \), we need to show for all sufficiently small \(\kappa \) that

$$\begin{aligned} \lambda ^{\frac{m \alpha }{2}+\kappa } \Vert \langle \varepsilon ^{-\frac{m \alpha }{2}} \mathcal {H}_{m} \left( F(\varPhi _\varepsilon )\right) - a_{m} \varPsi ^{\diamond m}, \varphi ^{\lambda }\rangle \Vert _{2n} \rightarrow 0 \end{aligned}$$
(3.2)

as \(\varepsilon \rightarrow 0\), uniformly over \(\lambda \in (\varepsilon , 1)\) and smooth \(\varphi \) supported in a ball of radius 1 such that \(\Vert \varphi \Vert _{\mathcal {C}^{\frac{m \alpha }{2}}} \le 1\). The rest of the section is devoted to the proof of (3.2).

Since \(\varPsi \) is stationary, so is \(\varPsi _{\varepsilon } = \rho _{\varepsilon } * \varPsi \). Hence, \(\varPhi _{\varepsilon }\) is a stationary Gaussian field with one-point marginal law \(\mu _{\varepsilon } = \mathcal {N}(0,\sigma _\varepsilon ^2)\), where

$$\begin{aligned} \sigma _{\varepsilon }^{2} = \mathbf {E}\varPhi _{\varepsilon }^{2} = \varepsilon ^{\alpha } \int G(x-y) \rho _{\varepsilon }(x) \rho _{\varepsilon }(y) \mathrm{d}x \mathrm{d}y, \end{aligned}$$
(3.3)

where G is the correlation function of \(\varPsi \) as in Assumption 1.2. The coefficient of the m-th term in the chaos expansion of \(F(\varPhi _\varepsilon )\) is given by

$$\begin{aligned} a_{m}^{(\varepsilon )} = \frac{1}{m!} \left( F^{(m)}*\mu _{\varepsilon } \right) (0). \end{aligned}$$
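As a quick numerical illustration of this coefficient formula (not part of the proof; the choices of \(F\), \(\sigma \), and \(m\) below are hypothetical), one can check by Monte Carlo that \(\frac{1}{m!}\mathbf {E}F^{(m)}(\varPhi _\varepsilon )\) agrees with the Hermite projection \(\frac{1}{m! \, \sigma _\varepsilon ^m}\mathbf {E}\big [F(\varPhi _\varepsilon ) \mathrm{He}_m(\varPhi _\varepsilon /\sigma _\varepsilon )\big ]\), as guaranteed by Gaussian integration by parts:

```python
# Monte Carlo sanity check (illustrative only) that the m-th chaos coefficient
# a_m = E[F^{(m)}(Phi)] / m!  agrees with the Hermite projection
# E[F(Phi) * He_m(Phi / sigma)] / (m! * sigma^m)  for Phi ~ N(0, sigma^2).
import math
import numpy as np
from numpy.polynomial.hermite_e import hermeval  # probabilists' Hermite He_m

rng = np.random.default_rng(0)
sigma, m = 0.7, 2                  # hypothetical sigma_eps and chaos order
F  = lambda x: x**4                # hypothetical even F
F2 = lambda x: 12.0 * x**2         # its second derivative F^{(2)}

phi = sigma * rng.standard_normal(10**7)
c = np.zeros(m + 1); c[m] = 1.0    # coefficient vector selecting He_m
lhs = np.mean(F2(phi)) / math.factorial(m)
rhs = np.mean(F(phi) * hermeval(phi / sigma, c)) / (math.factorial(m) * sigma**m)
print(lhs, rhs)                    # both are close to 6 * sigma**2 = 2.94
```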

We split the difference \(\varepsilon ^{-\frac{m \alpha }{2}} \mathcal {H}_{m}\left( F(\varPhi _{\varepsilon })\right) - a_m \varPsi ^{\diamond m}\) into three parts by

$$\begin{aligned} \begin{aligned} \varepsilon ^{-\frac{m \alpha }{2}} \mathcal {H}_{m}\left( F(\varPhi _{\varepsilon })\right) - a_{m} \varPsi ^{\diamond m}&= \left( \varepsilon ^{-\frac{m \alpha }{2}} \mathcal {H}_{m}\left( F(\varPhi _{\varepsilon })\right) - a_{m}^{(\varepsilon )} \varPsi _{\varepsilon }^{\diamond m} \right) \\&\quad + a_{m}^{(\varepsilon )} \left( \varPsi _{\varepsilon }^{\diamond m} - \varPsi ^{\diamond m} \right) + \left( a_{m}^{(\varepsilon )} - a_{m} \right) \varPsi ^{\diamond m}, \end{aligned} \end{aligned}$$
(3.4)

and we show that each of them satisfies a bound of the form (3.2).

The latter two terms are simpler. For the second one, we notice that \(m < \frac{|\mathfrak {s}|}{\alpha }\) ensures \(\varPsi ^{\diamond m}\) is well defined, and for all sufficiently small \(\kappa \), we have

$$\begin{aligned} \Vert \langle \varPsi _\varepsilon ^{\diamond m} - \varPsi ^{\diamond m}, \varphi ^{\lambda }\rangle \Vert _{2n} \lesssim _{n} \varepsilon ^{\kappa } \lambda ^{-\frac{m \alpha }{2}-\kappa } \end{aligned}$$

uniformly over \(\lambda \in (\varepsilon ,1)\). Since \(a_{m}^{(\varepsilon )}\) is uniformly bounded in \(\varepsilon \), and since \(\lambda ^{\frac{m \alpha }{2}+\kappa } \cdot \varepsilon ^{\kappa } \lambda ^{-\frac{m \alpha }{2}-\kappa } = \varepsilon ^{\kappa }\), it then follows that

$$\begin{aligned} \lambda ^{\frac{m \alpha }{2}+\kappa } \Vert \langle \varPsi _{\varepsilon }^{\diamond m} - \varPsi ^{\diamond m}, \varphi ^{\lambda }\rangle \Vert _{2n} \rightarrow 0 \end{aligned}$$

as \(\varepsilon \rightarrow 0\), uniformly over \(\lambda \in (\varepsilon , 1)\) and \(\varphi \) in the range required in Proposition 3.1.

For the third term, it suffices to note that assumption (1.6) on G guarantees that \(\sigma _{\varepsilon }^{2} \rightarrow \sigma ^2\), where \(\sigma _\varepsilon ^2\) and \(\sigma ^2\) are given by (3.3) and (1.8) respectively. Hence, \(a_{m}^{(\varepsilon )} \rightarrow a_m\). The desired bound of the form (3.2) then follows from the boundedness of \(\varPsi ^{\diamond m}\) in \(\mathcal {C}^{-\frac{m \alpha }{2}-\kappa }\).
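To make the convergence \(\sigma _\varepsilon ^2 \rightarrow \sigma ^2\) plausible, here is a sketch under the natural conventions (we assume \(\rho _\varepsilon (x) = \varepsilon ^{-|\mathfrak {s}|} \rho (\varepsilon ^{-\mathfrak {s}} x)\), where \(\varepsilon ^{\mathfrak {s}} u\) denotes the componentwise scaling \((\varepsilon ^{s_1} u_1, \dots , \varepsilon ^{s_d} u_d)\), and that G behaves like \(|x|^{-\alpha }\) at small scales as in (1.6)): substituting \(x = \varepsilon ^{\mathfrak {s}} u\) and \(y = \varepsilon ^{\mathfrak {s}} v\) in (3.3) yields

$$\begin{aligned} \sigma _{\varepsilon }^{2} = \varepsilon ^{\alpha } \int G\left( \varepsilon ^{\mathfrak {s}}(u-v)\right) \rho (u) \rho (v) \mathrm{d}u \mathrm{d}v \rightarrow \int \frac{\rho (u) \rho (v)}{|u-v|^{\alpha }} \mathrm{d}u \mathrm{d}v, \end{aligned}$$

since \(|\varepsilon ^{\mathfrak {s}} w| = \varepsilon |w|\) for the scaled distance, consistent with the expression (1.8) for \(\sigma ^2\).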

We now turn to the first term on the right-hand side of (3.4), which requires the use of the bound in Theorem 1.1. We first note that the covariance of \(\varPhi _\varepsilon = \varepsilon ^{\frac{\alpha }{2}} \rho _{\varepsilon } * \varPsi \) has the form

$$\begin{aligned} \mathbf {E}\left( \varPhi _\varepsilon (x) \varPhi _\varepsilon (y) \right) = \varepsilon ^{\alpha } (\rho _{\varepsilon }^{\star 2} \star G)(x-y), \end{aligned}$$

where \(\star \) denotes the forward convolution in the sense that \((f \star g)(x) = \int f(x+y) g(y) \mathrm{d}y\). Assumption 1.2 on G guarantees that \(\varPhi _{\varepsilon }\) satisfies the assumption (1.4) in Theorem 1.1 with some \(\varLambda >1\). Since \(\varPsi _{\varepsilon }^{\diamond m} = \varepsilon ^{-\frac{m\alpha }{2}} \varPhi _\varepsilon ^{\diamond m}\), and \(a_{m}^{(\varepsilon )}\) is precisely the m-th coefficient in the chaos expansion of \(F(\varPhi _{\varepsilon })\), we have

$$\begin{aligned} \varepsilon ^{-\frac{m \alpha }{2}} \mathcal {H}_{m}\left( F(\varPhi _{\varepsilon })\right) - a_{m}^{(\varepsilon )} \varPsi _{\varepsilon }^{\diamond m} = \varepsilon ^{-\frac{m \alpha }{2}} \mathcal {H}_{m+1}\left( F(\varPhi _{\varepsilon })\right) . \end{aligned}$$
(3.5)
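Indeed, reading \(\mathcal {H}_m\) as the projection onto the Wiener chaoses of order at least m (the convention consistent with (3.5)), the chaos expansion \(F(\varPhi _\varepsilon ) = \sum _{k \ge 0} a_k^{(\varepsilon )} \varPhi _\varepsilon ^{\diamond k}\) gives

$$\begin{aligned} \varepsilon ^{-\frac{m \alpha }{2}} \mathcal {H}_{m}\left( F(\varPhi _{\varepsilon })\right) - a_{m}^{(\varepsilon )} \varPsi _{\varepsilon }^{\diamond m} = \varepsilon ^{-\frac{m \alpha }{2}} \left( \mathcal {H}_{m}\left( F(\varPhi _{\varepsilon })\right) - a_{m}^{(\varepsilon )} \varPhi _{\varepsilon }^{\diamond m} \right) = \varepsilon ^{-\frac{m \alpha }{2}} \sum _{k \ge m+1} a_{k}^{(\varepsilon )} \varPhi _{\varepsilon }^{\diamond k}. \end{aligned}$$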

We leave aside the factor \(\varepsilon ^{-\frac{m \alpha }{2}}\) and focus on \(\mathcal {H}_{m+1} \big ( F(\varPhi _\varepsilon ) \big )\) for a moment. Fourier expanding F and changing the order of integration, we get the identity

$$\begin{aligned} \langle \mathcal {H}_{m+1}\left( F(\varPhi _\varepsilon ) \right) , \varphi ^{\lambda }\rangle = \langle \widehat{F}, \mathcal {A}\varPhi _\varepsilon \rangle _{\theta }, \end{aligned}$$

where

$$\begin{aligned} (\mathcal {A}\varPhi _\varepsilon )(\theta ) = (\mathcal {A}_{\varphi , \lambda ,m} \varPhi _\varepsilon )(\theta ) = \int _{\mathbf {R}^d} \mathcal {H}_{m+1} \left( \mathrm{e}^{i \theta \varPhi _{\varepsilon }(x)} \right) \varphi ^{\lambda }(x) \mathrm{d}x, \end{aligned}$$
(3.6)

and the subscript \(\theta \) on the inner product indicates that the testing is taken with respect to the Fourier variable \(\theta \in \mathbf {R}\); the identity itself follows by Fubini for each fixed \(\varepsilon \). We now omit the subscripts in \(\mathcal {A}\) for simplicity. Recall that \(\mathcal {I}_k = (k-1,k+1)\). Multiplying \(\mathcal {A}\varPhi _{\varepsilon }\) by a partition of unity subordinate to the intervals \(\{\mathcal {I}_k\}_{k \in \mathbf {Z}}\) and separating the resulting terms, we get the bound

$$\begin{aligned} |\langle \mathcal {H}_{m+1}\left( F(\varPhi _\varepsilon ) \right) , \varphi ^{\lambda }\rangle | \le C_M \sum _{k \in \mathbf {Z}} \Vert \widehat{F}\Vert _{M,\mathcal {I}_k} \sup _{0 \le r \le M} \sup _{\theta \in \mathcal {I}_k} |(\mathcal {A}\varPhi _\varepsilon )^{(r)}(\theta )|, \end{aligned}$$

where \(M \in \mathbf {N}\) is as in Assumption 1.3. Now, taking 2n-th moments on both sides and using the triangle inequality, we get

$$\begin{aligned} \Vert \langle \mathcal {H}_{m+1}\left( F(\varPhi _\varepsilon ) \right) , \varphi ^{\lambda }\rangle \Vert _{2n} \le C_M \sum _{k \in \mathbf {Z}} \Vert \widehat{F}\Vert _{M,\mathcal {I}_k} \left( \mathbf {E}\sup _{0 \le r \le M} \sup _{\theta \in \mathcal {I}_k} |(\mathcal {A}\varPhi _\varepsilon )^{(r)}(\theta )|^{2n} \right) ^{\frac{1}{2n}}. \end{aligned}$$
(3.7)

There are two suprema inside the expectation. The first supremum is taken over \(M+1\) elements, so we can move it out of \(\big ( \mathbf {E}|\cdot |^{2n} \big )^{\frac{1}{2n}}\) at the cost of a constant multiple depending on M and n only. The second one is taken over an interval, so we need the following lemma to interchange it with the expectation.

Lemma 3.2

Suppose f is a random \(\mathcal {C}^1\) function on a bounded interval \(\mathcal {I}\). For every \(p \ge 1\), there exists \(C_{p,|\mathcal {I}|}\) depending only on p and \(|\mathcal {I}|\) such that

$$\begin{aligned} \mathbf {E}\sup _{\theta \in \mathcal {I}} |f(\theta )|^{p} \le C_{p,|\mathcal {I}|} \sup _{\theta \in \mathcal {I}} \mathbf {E}\left( |f(\theta )|^{p} + |f'(\theta )|^{p} \right) . \end{aligned}$$

Proof

Fix an arbitrary \(\theta _0 \in \mathcal {I}\). By the fundamental theorem of calculus and Hölder's inequality, we have

$$\begin{aligned} |f(\theta )| \le |f(\theta _0)| + |\mathcal {I}|^{1-\frac{1}{p}} \left( \int _{\mathcal {I}} |f'(x)|^{p} \mathrm{d}x \right) ^{\frac{1}{p}}. \end{aligned}$$

Taking the supremum over \(\theta \), raising both sides to the p-th power, and using \((a+b)^{p} \le 2^{p-1}(a^{p}+b^{p})\), we get

$$\begin{aligned} \sup _{\theta } |f(\theta )|^{p} \le C \left( |f(\theta _0)|^{p} + \int _{\mathcal {I}} |f'(\theta )|^{p} \mathrm{d}\theta \right) , \end{aligned}$$

where C depends only on p and \(|\mathcal {I}|\). The assertion then follows by taking expectations on both sides and noting that

$$\begin{aligned} \mathbf {E}\int _{\mathcal {I}} |f'(\theta )|^{p} \mathrm{d} \theta = \int _{\mathcal {I}} \mathbf {E}|f'(\theta )|^{p} \mathrm{d} \theta \le |\mathcal {I}| \sup _{\theta \in \mathcal {I}} \mathbf {E}|f'(\theta )|^{p}. \end{aligned}$$

This completes the proof of the lemma. \(\square \)
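As a small numerical illustration of Lemma 3.2 (again not part of the argument; the random function below is a hypothetical example), one can compare both sides by Monte Carlo for \(f(\theta ) = A \cos \theta + B \sin \theta \) with independent standard Gaussians A, B:

```python
# Monte Carlo illustration (not from the paper) of Lemma 3.2 for the
# hypothetical random C^1 function f(theta) = A cos(theta) + B sin(theta)
# on the interval I = (0, 2), with p = 4.
import numpy as np

rng = np.random.default_rng(1)
p, n_samples = 4, 10**5
theta = np.linspace(0.0, 2.0, 101)             # a grid on I
A = rng.standard_normal((n_samples, 1))
B = rng.standard_normal((n_samples, 1))
f  = A * np.cos(theta) + B * np.sin(theta)     # samples of f on the grid
fp = -A * np.sin(theta) + B * np.cos(theta)    # samples of f'

lhs = np.mean(np.max(np.abs(f), axis=1) ** p)                    # E sup |f|^p
rhs = np.max(np.mean(np.abs(f) ** p + np.abs(fp) ** p, axis=0))  # sup E(|f|^p + |f'|^p)
print(lhs, rhs, lhs / rhs)   # the ratio stays bounded, as the lemma asserts
```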

Using Lemma 3.2 to interchange the expectation and supremum, we have

$$\begin{aligned} \mathbf {E}\sup _{0 \le r \le M} \sup _{\theta \in \mathcal {I}_k} |(\mathcal {A}\varPhi _\varepsilon )^{(r)}(\theta )|^{2n} \lesssim _{n,M} \sup _{0 \le r \le M+1} \sup _{\theta \in \mathcal {I}_k} \mathbf {E}|(\mathcal {A}\varPhi _\varepsilon )^{(r)}(\theta )|^{2n}, \end{aligned}$$

where the supremum over r is taken over \(r \le M+1\) to include the one additional derivative required in the interchange. Plugging it back into the right-hand side of (3.7), we obtain

$$\begin{aligned} \Vert \langle \mathcal {H}_{m+1}\left( F(\varPhi _\varepsilon )\right) , \varphi ^\lambda \rangle \Vert _{2n} \lesssim _{n,M} \sum _{k \in \mathbf {Z}} \Vert \widehat{F}\Vert _{M,\mathcal {I}_k} \sup _{0 \le r \le M+1} \sup _{\theta \in \mathcal {I}_k} \left( \mathbf {E}|(\mathcal {A}\varPhi _\varepsilon )^{(r)}(\theta )|^{2n} \right) ^{\frac{1}{2n}}. \end{aligned}$$
(3.8)

It then remains to control the quantity \(\mathbf {E}|(\mathcal {A}\varPhi _\varepsilon )^{(r)}(\theta )|^{2n}\). Recalling the expression of \(\mathcal {A}\varPhi _\varepsilon \) in (3.6), we have

$$\begin{aligned} \mathbf {E}|(\mathcal {A}\varPhi _\varepsilon )^{(r)}(\theta )|^{2n} = \int _{(\mathbf {R}^d)^{2n}} \left( \mathbf {E}\prod _{j=1}^{2n} \partial _{\theta }^{r} \mathcal {H}_{m+1}\left( \mathrm{e}^{i \theta \varPhi _{\varepsilon }(x_j)} \right) \right) \cdot \left( \prod _{j=1}^{2n} \varphi ^{\lambda }(x_j) \right) \mathrm{d}\mathbf {x}, \end{aligned}$$

where we used the shorthand notation \(\mathbf {x}=(x_1, \dots , x_{2n})\). We now apply Theorem 1.1 to the expectation above, which gives

$$\begin{aligned} \mathbf {E}\prod _{j=1}^{2n} \partial _{\theta }^{r} \mathcal {H}_{m+1}\left( \mathrm{e}^{i \theta \varPhi _{\varepsilon }(x_j)} \right) \lesssim _{r} \mathbf {E}\prod _{j=1}^{2n} \left( \varPhi _{\varepsilon }^{\diamond (m+1)}(x_j) + \varPhi _{\varepsilon }^{\diamond (m+2)}(x_j) \right) . \end{aligned}$$

Plugging it into the integral on the right-hand side above and using the identity

$$\begin{aligned} \begin{aligned}&\phantom {11}\int \left[ \mathbf {E}\prod _{j=1}^{2n} \left( \varPhi _{\varepsilon }^{\diamond (m+1)}(x_j) + \varPhi _{\varepsilon }^{\diamond (m+2)}(x_j) \right) \right] \cdot \left( \prod _{j=1}^{2n} \varphi ^{\lambda }(x_j) \right) \mathrm{d}\mathbf {x}\\&\qquad = \mathbf {E}|\langle \varPhi _\varepsilon ^{\diamond (m+1)}+\varPhi _\varepsilon ^{\diamond (m+2)}, \varphi ^\lambda \rangle |^{2n}, \end{aligned} \end{aligned}$$

we get

$$\begin{aligned} \left( \mathbf {E}|(\mathcal {A}\varPhi _\varepsilon )^{(r)}(\theta )|^{2n} \right) ^{\frac{1}{2n}} \lesssim _{r,n} \sum _{\ell =1}^{2} \left( \mathbf {E}|\langle \varPhi _\varepsilon ^{\diamond (m+\ell )}, \varphi ^\lambda \rangle |^{2n}\right) ^{\frac{1}{2n}}. \end{aligned}$$
(3.9)

In particular, the bound is uniform in both \(\varepsilon \) and \(\theta \). We have the following lemma controlling the right-hand side above.

Lemma 3.3

For every integer \(\ell \ge 1\) and every sufficiently small \(\kappa \), we have

$$\begin{aligned} \left( \mathbf {E}|\langle \varPhi _\varepsilon ^{\diamond (m+\ell )}, \varphi ^\lambda \rangle |^{2n}\right) ^{\frac{1}{2n}} \lesssim _{n,\ell } \varepsilon ^{\frac{m \alpha }{2}+\kappa } \lambda ^{-\frac{m \alpha }{2}-\kappa } \end{aligned}$$

uniformly over \(\varepsilon , \lambda \in (0,1)\).

Proof

Since \(\varPhi _\varepsilon \) is Gaussian, \(\varPhi _\varepsilon ^{\diamond (m+\ell )}\) lives in a fixed Wiener chaos, so by equivalence of moments on Wiener chaoses, the left-hand side above can be controlled by

$$\begin{aligned} \left( \mathbf {E}|\langle \varPhi _\varepsilon ^{\diamond (m+\ell )}, \varphi ^\lambda \rangle |^{2n}\right) ^{\frac{1}{2n}} \lesssim _{n} \left( \mathbf {E}|\langle \varPhi _\varepsilon ^{\diamond (m+\ell )}, \varphi ^\lambda \rangle |^{2}\right) ^{\frac{1}{2}}, \end{aligned}$$

so we only need to bound the second moment. By Wick’s formula, we have

$$\begin{aligned} \mathbf {E}|\langle \varPhi _\varepsilon ^{\diamond (m+\ell )}, \varphi ^\lambda \rangle |^{2} = (m+\ell )! \int \left( \mathbf {E}\varPhi _{\varepsilon }(x) \varPhi _{\varepsilon }(y) \right) ^{m+\ell } \varphi ^{\lambda }(x) \varphi ^{\lambda }(y) \mathrm{d}x \mathrm{d}y. \end{aligned}$$

Since \(\ell \ge 1\), we have

$$\begin{aligned} \left( \mathbf {E}\varPhi _\varepsilon (x) \varPhi _\varepsilon (y) \right) ^{m+\ell } \le \frac{\varLambda ^{m+\ell } \varepsilon ^{\alpha (m+\ell )}}{(|x-y|+\varepsilon )^{\alpha (m+\ell )}} \le \frac{\varLambda ^{m+\ell } \varepsilon ^{m \alpha +\kappa }}{|x-y|^{m \alpha + \kappa }} \end{aligned}$$

for all \(\kappa \in (0,\ell \alpha )\). For all \(\kappa \) sufficiently small such that \(m \alpha + \kappa < |\mathfrak {s}|\), the singularity on the right-hand side above is integrable, so we have

$$\begin{aligned} \mathbf {E}|\langle \varPhi _\varepsilon ^{\diamond (m+\ell )}, \varphi ^\lambda \rangle |^{2} \lesssim \varepsilon ^{m \alpha + \kappa } \int \frac{\varphi ^{\lambda }(x) \varphi ^{\lambda }(y)}{|x-y|^{m\alpha +\kappa }} \mathrm{d}x \mathrm{d}y \lesssim \varepsilon ^{m \alpha +\kappa } \lambda ^{-m\alpha -\kappa }. \end{aligned}$$
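The final inequality is a scaling computation: assuming as before that \(|\cdot |\) is homogeneous under the scaling, i.e., \(|\lambda ^{\mathfrak {s}} w| = \lambda |w|\), the substitution \(x = \lambda ^{\mathfrak {s}} u\), \(y = \lambda ^{\mathfrak {s}} v\) gives

$$\begin{aligned} \int \frac{\varphi ^{\lambda }(x) \varphi ^{\lambda }(y)}{|x-y|^{m\alpha +\kappa }} \mathrm{d}x \mathrm{d}y = \lambda ^{-m\alpha -\kappa } \int \frac{\varphi (u) \varphi (v)}{|u-v|^{m\alpha +\kappa }} \mathrm{d}u \mathrm{d}v, \end{aligned}$$

and the last integral is bounded uniformly over the admissible \(\varphi \) since \(m\alpha + \kappa < |\mathfrak {s}|\).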

The proof is complete upon taking square roots in the last bound and running the argument with \(2\kappa \) in place of \(\kappa \). \(\square \)
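The Wick formula used above reduces, for a single pair of points, to the identity \(\mathbf {E}\big [X^{\diamond k} Y^{\diamond k}\big ] = k! \, (\mathbf {E}XY)^k\) for jointly Gaussian standard X, Y. A quick Monte Carlo check (illustrative only, with hypothetical parameters):

```python
# Monte Carlo check (illustrative) of E[X^{<>k} Y^{<>k}] = k! * rho^k for
# standard jointly Gaussian X, Y with correlation rho, where the Wick power
# X^{<>k} equals the probabilists' Hermite polynomial He_k(X).
import math
import numpy as np
from numpy.polynomial.hermite_e import hermeval

rng = np.random.default_rng(2)
k, rho, n = 3, 0.6, 10**7
X = rng.standard_normal(n)
Y = rho * X + math.sqrt(1.0 - rho**2) * rng.standard_normal(n)
c = np.zeros(k + 1); c[k] = 1.0    # coefficient vector selecting He_k
emp = np.mean(hermeval(X, c) * hermeval(Y, c))
print(emp, math.factorial(k) * rho**k)   # both are close to 6 * 0.216 = 1.296
```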

Now, combining (3.8), (3.9), and Lemma 3.3, and summing over k using Assumption 1.3 on F, we get

$$\begin{aligned} \begin{aligned} \Vert \langle \mathcal {H}_{m+1}\left( F(\varPhi _\varepsilon )\right) , \varphi ^\lambda \rangle \Vert _{2n}&\lesssim _{n,M} \varepsilon ^{\frac{m\alpha }{2}+\kappa } \lambda ^{-\frac{m\alpha }{2}-\kappa } \sum _{k \in \mathbf {Z}} \Vert \widehat{F}\Vert _{M,\mathcal {I}_k}\\&\lesssim _{n,M} \varepsilon ^{\frac{m\alpha }{2}+\kappa } \lambda ^{-\frac{m\alpha }{2}-\kappa }. \end{aligned} \end{aligned}$$

Substituting the above bound back into (3.5) and noting that \(\lambda ^{\frac{m\alpha }{2}+\kappa } \cdot \varepsilon ^{-\frac{m \alpha }{2}} \cdot \varepsilon ^{\frac{m\alpha }{2}+\kappa } \lambda ^{-\frac{m\alpha }{2}-\kappa } = \varepsilon ^{\kappa }\), we deduce that

$$\begin{aligned} \lambda ^{\frac{m\alpha }{2}+\kappa }\Vert \langle \varepsilon ^{-\frac{m \alpha }{2}} \mathcal {H}_{m}\left( F(\varPhi _{\varepsilon })\right) - a_{m}^{(\varepsilon )} \varPsi _{\varepsilon }^{\diamond m}, \varphi ^\lambda \rangle \Vert _{2n} \lesssim _{n,M} \varepsilon ^{\kappa }, \end{aligned}$$

which is the desired bound. The proof of Theorem 1.4 is thus complete.