1 Introduction

Let \(\eta \) be a Poisson process on a measurable space \((\mathbb {X},\mathcal {X})\) with a \(\sigma \)-finite intensity measure \(\lambda \), defined on some probability space \((\Omega ,\mathcal {F},\mathbb {P})\). Formally, \(\eta \) is a point process, which is a random element of the space \(\textbf{N}\) of all \(\sigma \)-finite measures on \(\mathbb {X}\) with values in \(\mathbb {N}_0\cup \{\infty \}\), equipped with the smallest \(\sigma \)-field \(\mathcal {N}\) making the mappings \(\mu \mapsto \mu (B)\) measurable for each \(B\in \mathcal {X}\). The Poisson process \(\eta \) is completely independent, that is, \(\eta (B_1),\ldots ,\eta (B_n)\) are independent for pairwise disjoint \(B_1,\ldots ,B_n\in {\mathcal {X}}\), \(n\in \mathbb {N}\), and \(\eta (B)\) has for each \(B\in \mathcal {X}\) a Poisson distribution with parameter \(\lambda (B)\), see, for example, [7, 14].
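As a numerical illustration of these two defining properties (our addition, not part of the original text), the following Python sketch simulates a Poisson process with constant intensity on the unit square, an assumed toy setting, and checks the Poisson distribution of the counts and the independence over disjoint sets.

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 100.0   # intensity measure lam * Lebesgue on X = [0, 1]^2 (toy choice)

def sample_eta():
    # standard construction for constant intensity: Poisson total number of
    # points, then iid uniform locations
    return rng.uniform(size=(rng.poisson(lam), 2))

counts_B1, counts_B2 = [], []
for _ in range(10000):
    pts = sample_eta()
    counts_B1.append(np.sum(pts[:, 0] < 0.5))    # eta(B1), B1 = [0, 1/2) x [0, 1]
    counts_B2.append(np.sum(pts[:, 0] >= 0.5))   # eta(B2), B2 = [1/2, 1] x [0, 1]

# eta(B1) is Poisson(lam/2): mean and variance both close to 50
print(np.mean(counts_B1), np.var(counts_B1))
# complete independence: counts on disjoint B1, B2 are uncorrelated
print(np.corrcoef(counts_B1, counts_B2)[0, 1])   # close to 0
```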

Let \(G:\textbf{N}\times \mathbb {X}\rightarrow \mathbb {R}\) be a measurable function which is square integrable with respect to \(\mathbb {P}_\eta \otimes \lambda \), where \(\mathbb {P}_\eta {:}{=}\mathbb {P}(\eta \in \cdot )\) denotes the distribution of \(\eta \). In this paper, we study the Kabanov–Skorohod integral (KS-integral for short) of G, defined as a Malliavin operator. If G is in the domain of the KS-integral and integrable with respect to \(\mathbb {P}_\eta \otimes \lambda \), its KS-integral is pathwise given by

$$\begin{aligned} \varvec{\delta }(G)=\int G_x(\eta -\delta _x)\,\eta (dx)-\int G_x(\eta )\,\lambda (dx), \end{aligned}$$
(1.1)

where \(\delta _x\) stands for the Dirac measure at \(x\in \mathbb {X}\), see, for example, [10, Theorem 6]. In this case, the Mecke formula immediately yields that \(\mathbb {E}\varvec{\delta }(G)=0\). We refer to [10] for an introduction to stochastic calculus on a general Poisson space.
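To make the pathwise formula (1.1) and the identity \(\mathbb {E}\varvec{\delta }(G)=0\) concrete, here is a minimal Monte Carlo sketch (our addition, not from the paper) with the assumed toy integrand \(G_x(\mu )=x\,e^{-\mu (\mathbb {X})/50}\) on \(\mathbb {X}=[0,1]\) and intensity measure \(50\cdot \) Lebesgue; note that the first term of (1.1) evaluates G at \(\eta -\delta _x\), that is, with one point removed.

```python
import numpy as np

rng = np.random.default_rng(1)
lam = 50.0   # eta has intensity measure lam * Lebesgue on X = [0, 1]

def ks_integral(pts):
    # pathwise formula (1.1) for G_x(mu) = x * exp(-mu(X)/lam), which depends
    # on mu only through the total number of points mu(X)
    n = len(pts)
    term_points = np.exp(-(n - 1) / lam) * pts.sum()  # sum of G_x(eta - delta_x)
    term_comp = lam * np.exp(-n / lam) * 0.5          # int G_x(eta) lam dx; int_0^1 x dx = 1/2
    return term_points - term_comp

vals = [ks_integral(rng.uniform(size=rng.poisson(lam))) for _ in range(20000)]
# the Mecke formula gives E delta(G) = 0: sample mean close to 0 within error
print(np.mean(vals), np.std(vals) / np.sqrt(len(vals)))
```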

The pathwise representation (1.1) of the KS-integral consists of two terms. The first term is the sum of the values \(G_x(\eta -\delta _x)\) over the points of \(\eta \). Such sums have been intensively studied. The state of the art of limit theorems for such sums is presented in [9], based on the idea of stabilisation. The stabilisation property means that the functional \(G_x(\eta -\delta _x)\) depends only on points of \(\eta \) within some finite random distance from x, with conditions imposed on the distribution of such a distance. As in [9], we use recent developments of the Malliavin–Stein technique for Poisson processes, first elaborated in [15] and then extended in [5, 8, 13, 22].

In all above-mentioned works, the sums over Poisson processes are centred by subtracting the expectation, which is

$$\begin{aligned} \mathbb {E}\int G_x(\eta -\delta _x)\,\eta (dx) = \int \mathbb {E}G_x(\eta ) \,\lambda (dx). \end{aligned}$$

In contrast, the centring involved in the pathwise construction of the KS-integral in (1.1) is random. As shown in [12], KS-integrals naturally appear in the construction of unbiased estimators derived from Poisson hull operators.

In this paper, we derive bounds for the Wasserstein and the Kolmogorov distance between \(\varvec{\delta }(G)\) and a standard normal random variable. Limit theorems for compensated stochastic Poisson integrals in the Wasserstein distance have been studied in several papers by N. Privault, assuming that \(\mathbb {X}\) is the Euclidean space \(\mathbb {R}^d\) with separate treatments of the cases \(d=1\) in [20] and \(d\ge 2\) in [19]. In [20], the integrand is assumed to be adapted, and in [19], it is assumed to be predictable and to have bounded support. In particular, the stochastic integral coincides in both cases with the KS-integral. Under these assumptions, the tools, based on derivation operators and Edgeworth-type expansions, have resulted in bounds involving integrals of the third power of G and differential operators applied to G. In comparison, our results apply to a general state space, are not restricted to predictable (or adapted) integrands and do not assume the support of the integrand to be bounded in any sense. Furthermore, our bounds are given in terms of difference operators directly applied to the integrand G, and are derived for both the Wasserstein and the Kolmogorov distance. However, our bounds contain the integral of the absolute value of G to power 3, which may be larger than the corresponding term in [19]. Our results are used in [12] to derive quantitative central limit theorems.

Let us compare our proof strategy with the standard approach for the normal approximation of Poisson functionals via the Malliavin–Stein method, which goes back to [15] and is also employed in [5, 8, 13, 22]. To this end, we omit all technical assumptions and definitions (some will be given later). Let F be a Poisson functional (a measurable function of \(\eta \)), and let f be the solution of the associated Stein equation. The identity \(\varvec{\delta }D=-L\), where D is the difference operator and L is the Ornstein–Uhlenbeck generator with its inverse \(L^{-1}\), and integration by parts lead to

$$\begin{aligned} \mathbb {E}F f(F) = \mathbb {E}\varvec{\delta }(-DL^{-1}F) f(F) = \mathbb {E}\int (-D_xL^{-1}F) D_xf(F) \, \lambda (dx). \end{aligned}$$

This step comes at the price of the term \(D_xL^{-1}F\), which is often difficult to evaluate and whose treatment is one of the main achievements of [13]. For the special case \(F=\varvec{\delta }(G)\), the identity \(\varvec{\delta }D=-L\) is not required. Instead, an immediate integration by parts yields that

$$\begin{aligned} \mathbb {E}F f(F) = \mathbb {E}\varvec{\delta }(G) f(\varvec{\delta }(G)) = \mathbb {E}\int G_x D_xf(\varvec{\delta }(G)) \, \lambda (dx), \end{aligned}$$
(1.2)

avoiding the inverse Ornstein–Uhlenbeck generator. We also treat the KS-integrals that arise from the Taylor expansion of \(D_xf(\varvec{\delta }(G))\) by integration by parts, so that our final bounds involve only G and its difference operators but no KS-integrals. This differs from [26], where the argument in (1.2) is used but no further integration by parts is carried out.

Even though our proofs differ from previous works, one may wonder whether existing Malliavin–Stein bounds can be applied to \(\varvec{\delta }(G)\). As they do not involve the inverse Ornstein–Uhlenbeck generator, the results from [13] seem to be the best ones for off-the-shelf use. They require only moments of the first and the second difference operator of the Poisson functional F, which one might also encounter when evaluating the bounds from [5, 8, 15, 22]. In our case, this means that one has to control moments like \(\mathbb {E}\big [ G_x^4 \big ]\), \(\mathbb {E}\big [ \varvec{\delta }(D_xG)^4 \big ]\) and \(\mathbb {E}\big [ \varvec{\delta }(D_{x,y}^2G)^4 \big ]\) for \(x,y\in \mathbb {X}\). Since we aim for bounds in terms of G and its difference operators, one has to remove the KS-integrals. This can be achieved by fourfold integration by parts, but it would lead to normal approximation bounds that are more involved than those in the current paper and even include iterated integrals with roots of the inner integrals. We expect these results to yield the same rates of convergence as our approach but under stronger integrability assumptions. Instead, our approach is direct and leads to much simpler calculations. In particular, it does not require the computation of expressions involving powers of KS-integrals apart from second moments.

Section 2 presents our main results, which are proved in Sects. 4 and 5 separately for the Wasserstein and Kolmogorov distances, after recalling necessary results and constructions from stochastic calculus on Poisson spaces in Sect. 3. We conclude with two examples in Sects. 6 and 7 concerning some linear statistics of point processes constructed via Poisson embeddings and Pareto optimal points.

2 Main Results

To state our results, we need to introduce some notation. The Wasserstein distance between the laws of two integrable random variables X and Y is defined by

$$\begin{aligned} d_W(X,Y){:}{=}\sup _{h\in {{\text {\textbf{Lip}}}}(1)} \big |\mathbb {E}[h(X)]-\mathbb {E}[h(Y)]\big |, \end{aligned}$$

where \({{\text {\textbf{Lip}}}}(1)\) denotes the space of all Lipschitz functions \(h:\mathbb {R}\rightarrow \mathbb {R}\) with Lipschitz constant at most one. The Kolmogorov distance between the laws of X and Y is given by

$$\begin{aligned} d_K(X,Y){:}{=}\sup _{t \in \mathbb {R}} |\mathbb {P}(X\le t)-\mathbb {P}(Y\le t)|. \end{aligned}$$

Given a function \(f:\textbf{N}\rightarrow \mathbb {R}\) and \(x\in \mathbb {X}\), the function \(D_xf:\textbf{N}\rightarrow \mathbb {R}\) is defined by

$$\begin{aligned} D_xf(\mu ){:}{=}f(\mu +\delta _x)-f(\mu ),\quad \mu \in \textbf{N}. \end{aligned}$$
(2.1)

Then \(D_x\) is known as the difference operator. Iterating its definition yields, for given \(x,z,w\in \mathbb {X}\), the second difference operator \(D^2_{x,z}\) and the third difference operator \(D^3_{x,z,w}\), which can again be applied to functions f as above. For a function \(G:\textbf{N}\times \mathbb {X}\rightarrow \mathbb {R}\) (which maps \((\mu ,y)\) to \(G_y(\mu )\)) and \(x,z,w\in \mathbb {X}\), we let \(D_x\), \(D^2_{x,z}\) and \(D^3_{x,z,w}\) act on \(G_y(\cdot )\), so that it makes sense to talk about \(D_xG_y(\mu )\), \(D^2_{x,z}G_y(\mu )\) and \(D^3_{x,z,w}G_y(\mu )\). Throughout the paper, we write \(G_y\) as shorthand for \(G_y(\eta )\), and similarly for difference operators.
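The difference operators have a simple computational meaning: add a point to the configuration and take differences. The following toy sketch (our illustration, not from the paper; a finite counting measure is represented as a Python list of points) implements (2.1) and its iteration.

```python
def D(f, x):
    # first difference operator (2.1): D_x f(mu) = f(mu + delta_x) - f(mu);
    # mu is a list of points, and mu + [x] adds the point x to it
    return lambda mu: f(mu + [x]) - f(mu)

def D2(f, x, z):
    # iterated (second) difference operator D^2_{x,z} = D_x D_z
    return D(D(f, z), x)

# toy functional f(mu) = mu(X)^2, a function of the number of points only
f = lambda mu: len(mu) ** 2
mu = [0.3, 0.7]                # n = 2 points
print(D(f, 0.1)(mu))           # (n+1)^2 - n^2 = 2n + 1 = 5
print(D2(f, 0.1, 0.5)(mu))     # second difference of n^2 is constant: 2
```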

We shall require the following integrability assumptions:

$$\begin{aligned}&\mathbb {E}\int G^2_y\,\lambda (dy)<\infty , \end{aligned}$$
(2.2)
$$\begin{aligned}&\mathbb {E}\int (D_xG_y)^2\,\lambda ^2(d(x,y))<\infty , \end{aligned}$$
(2.3)
$$\begin{aligned}&\mathbb {E}\int (D^2_{z,x} G_y)^2\,\lambda ^3(d(x,y,z))<\infty , \end{aligned}$$
(2.4)
$$\begin{aligned}&\mathbb {E}\int (D^3_{w,z,x} G_y)^2\,\lambda ^3(d(w,y,z))<\infty ,\quad \lambda \hbox { -a.e.}\ x. \end{aligned}$$
(2.5)

If (2.2) and (2.3) hold, it follows from [13, Proposition 2.3] that the KS-integral \(\varvec{\delta }(G)\) of G is defined as a Malliavin operator and satisfies

$$\begin{aligned} {{\,\mathrm{{\mathbb Var}}\,}}\varvec{\delta }(G) = \mathbb {E}\int G_x^2 \, \lambda (dx) + \mathbb {E}\int D_xG_y D_yG_x \, \lambda ^2(d(x,y)). \end{aligned}$$
(2.6)
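As a sanity check of (2.6) (our addition, not part of the original argument), the following sketch estimates both sides by Monte Carlo for the same assumed toy integrand \(G_x(\mu )=x\,e^{-\mu (\mathbb {X})/50}\) on \(\mathbb {X}=[0,1]\) with intensity measure \(50\cdot \) Lebesgue, for which both inner integrals have closed forms given \(n=\eta (\mathbb {X})\).

```python
import numpy as np

rng = np.random.default_rng(2)
lam = 50.0
c = lambda n: np.exp(-n / lam)   # G_x(mu) = x * c(mu(X))

vals, t1, t2 = [], [], []
for _ in range(20000):
    n = rng.poisson(lam)
    pts = rng.uniform(size=n)
    vals.append(c(n - 1) * pts.sum() - lam * c(n) * 0.5)    # pathwise (1.1)
    # int G_x^2 lam(dx) = lam c(n)^2 / 3, since int_0^1 x^2 dx = 1/3
    t1.append(lam * c(n) ** 2 / 3)
    # D_xG_y = y (c(n+1) - c(n)), so int D_xG_y D_yG_x lam^2 = lam^2 (c(n+1)-c(n))^2 / 4
    t2.append(lam ** 2 * (c(n + 1) - c(n)) ** 2 / 4)

print(np.var(vals), np.mean(t1) + np.mean(t2))   # the two sides of (2.6) agree
```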

In order to deal with the Kolmogorov distance, we also need to assume that

$$\begin{aligned}&\mathbb {E}\int |D_xG_y G_x|\,\lambda ^2(d(x,y))<\infty , \end{aligned}$$
(2.7)
$$\begin{aligned}&\mathbb {E}\int \big (D_z(G_x|G_x|)\big )^2 \, \lambda ^2(d(x,z))<\infty , \end{aligned}$$
(2.8)
$$\begin{aligned}&\mathbb {E}\int \bigg ( \int D_z(D_xG_y D_y|G_x|) \, \lambda (dy) \bigg )^2 \, \lambda ^2(d(x,z))<\infty . \end{aligned}$$
(2.9)

The following main result on the normal approximation of \(\varvec{\delta }(G)\) involves only the integrand G and its first-, second- and third-order difference operators. Throughout the paper, we let N denote a standard normal random variable. Define

$$\begin{aligned} T_1&{:}{=}\left( \mathbb {E}\int \left( \int D_y(G_x^2) \, \lambda (dx)\right) ^2\, \lambda (dy)\right) ^{1/2}, \\ T_2&{:}{=}\left( \mathbb {E}\int \left( \int D_z(D_xG_y D_yG_x) \, \,\lambda ^2(d(x,y))\right) ^2 \,\lambda (dz)\right) ^{1/2}, \\ T_3&{:}{=} \mathbb {E}\int |G_x|^3 \, \lambda (dx), \\ T_4&{:}{=} \mathbb {E}\int \Big ( 3 \big |D_xG_y D_yG_x G_x\big | + \big |D_xG_y (D_yG_x)^2\big | + 2G_x^2 \big |D_xG_y\big | \\&\qquad \qquad + \big | (G_x + D_yG_x) D_xG_y \big | \big (2|G_y|+|D_xG_y+D_yG_x|\big ) \Big ) \, \lambda ^2(d(x,y)), \\ T_5&{:}{=} \mathbb {E}\int \bigg ( 2\big (|D_yG_z|+|D^2_{x,y}G_z|\big ) \Big (|D_z\big ((G_x + D_yG_x) D_xG_y\big )| + 2|(G_x + D_yG_x) D_xG_y| \Big ) \\&\qquad \qquad + |D_xG_z| \Big (|D_z\big (D_yG_x D_xG_y\big )| + 2|D_yG_x D_xG_y| \Big ) \bigg ) \, \lambda ^3(d(x,y,z)), \\ T_6&{:}{=} \bigg (\mathbb {E}\int \bigg ( \int (G_x+D_yG_x) D_xG_y \, \lambda (dx)\bigg )^2\, \lambda (dy) \\&\qquad \qquad +\mathbb {E}\int \bigg ( \int D_z\big ((G_x+D_yG_x) D_xG_y\big ) \, \lambda (dx)\bigg )^2 \lambda ^2(d(y,z)) \bigg )^{1/2}, \\ T_7&{:}{=} \bigg ( \mathbb {E}\int G_x^4 \, \lambda (dx) + \mathbb {E}\int D_x(G_y|G_y|) D_y(G_x|G_x|) \, \lambda ^2(d(x,y)) \bigg )^{1/2}, \\ T_8&{:}{=} \bigg ( \mathbb {E}\int \bigg ( \int D_xG_y D_y|G_x| \, \lambda (dy) \bigg )^2\, \lambda (dx) \\&\qquad \qquad + \mathbb {E}\int D_x\bigg ( \int D_zG_y D_y|G_z| \, \lambda (dy) \bigg ) D_z\bigg ( \int D_xG_y D_y|G_x| \lambda (dy) \bigg ) \, \lambda ^2(d(x,z))\bigg )^{1/2}, \\ T_9&{:}{=} \bigg ( 3\mathbb {E}\int (D_xG_y)^2 \big (D_y|G_x|+|G_x|\big )^2 \, \lambda ^2(d(x,y)) \\&\qquad \qquad + 3 \mathbb {E}\int \Big ( D_z\big ( D_xG_y (D_y|G_x|+|G_x|) \big ) \Big )^2 \, \lambda ^3(d(x,y,z)) \\&\qquad \qquad + 2 \mathbb {E}\int \Big ( D^2_{z,w}\big ( D_xG_y (D_y|G_x|+|G_x|)\big ) \Big )^2 \, \lambda ^4(d(x,y,z,w)) \bigg )^{1/2}. \end{aligned}$$

Theorem 2.1

Suppose that \(G:\textbf{N}\times \mathbb {X}\rightarrow \mathbb {R}\) satisfies assumptions (2.2), (2.3), (2.4) and (2.5). Assume also that \(\mathbb {E}\varvec{\delta }(G)^2=1\). Then

$$\begin{aligned} d_W(\varvec{\delta }(G),N)\le T_1+T_2+T_3+T_4+T_5. \end{aligned}$$
(2.10)

If, additionally, (2.7), (2.8) and (2.9) are satisfied, then

$$\begin{aligned} d_K(\varvec{\delta }(G),N)\le T_1+T_2+T_6+2(T_7+T_8+T_9). \end{aligned}$$
(2.11)

We say that the functional G satisfies the cyclic condition of order two if

$$\begin{aligned} D_xG_yD_yG_x=0 \quad \text { a.s. } \quad \text { for } \quad \lambda ^2\text { -a.e. } \quad (x,y)\in \mathbb {X}^2, \end{aligned}$$
(2.12)

see [18], where such conditions were used to simplify moment formulae for the KS-integral. Note that (2.12) always holds if the functional G is predictable, that is, the carrier space is equipped with a strict partial order \(\prec \) and \(G_y(\eta )\) depends only on \(\eta \) restricted to \(\{x\in \mathbb {X}:x\prec y\}\). If (2.12) holds, then also

$$\begin{aligned} D_x|G_y| D_y|G_x| = 0 \quad \text { and } \quad D_xG_y D_y|G_x| = 0 \quad \text { a.s. } \quad \text { for } \quad \lambda ^2\text { -a.e. } \quad (x,y)\in \mathbb {X}^2, \end{aligned}$$

since

$$\begin{aligned} 0 \le \big | D_x|G_y| D_y|G_x| \big |&= \big | D_x|G_y|\big | \big |D_y|G_x| \big | \le |D_xG_y| \big |D_y|G_x| \big | \le |D_xG_y| |D_yG_x| \\&= 0. \end{aligned}$$

In view of this, under the cyclic condition, the bounds from Theorem 2.1 simplify as follows.

Corollary 2.2

Assume that the cyclic condition (2.12) holds and that the assumptions of Theorem 2.1 are satisfied. Then the bounds (2.10) and (2.11) hold with \(T_2=T_8=0\), and

$$\begin{aligned} T_4&=\mathbb {E}\int \Big (2G_x^2 |D_xG_y| + \big | G_x D_xG_y\big | \big (2|G_y|+|D_xG_y|\big )\Big ) \, \lambda ^2(d(x,y)),\\ T_5&=\mathbb {E}\int 2\Big (|D_yG_z|+|D^2_{x,y}G_z|\Big ) \Big (\big |D_z(G_x D_xG_y)\big | + 2 |G_xD_xG_y| \Big ) \, \lambda ^3(d(x,y,z)),\\ T_6&=\bigg ( \mathbb {E}\int \bigg ( \int G_x D_xG_y \, \lambda (dx)\bigg )^2 \, \lambda (dy) + \mathbb {E}\int \bigg ( \int D_z\big (G_x D_xG_y\big ) \, \lambda (dx)\bigg )^2 \, \lambda ^2(d(y,z)) \bigg )^{1/2},\\ T_7&= \bigg ( \mathbb {E}\int G_x^4 \, \lambda (dx) \bigg )^{1/2},\\ T_9&= \bigg ( 3\mathbb {E}\int (D_xG_y)^2 G_x^2 \, \lambda ^2(d(x,y)) + 3 \mathbb {E}\int \big ( D_z( D_xG_y |G_x| ) \big )^2 \, \lambda ^3(d(x,y,z)) \\&\qquad \qquad + 2 \mathbb {E}\int \big ( D^2_{z,w}(D_xG_y |G_x|) \big )^2 \, \lambda ^4(d(x,y,z,w)) \bigg )^{1/2}. \end{aligned}$$
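As a quick numerical illustration of the predictability remark above (our addition), take \(\mathbb {X}=[0,1]\) with the usual order and the assumed toy integrand \(G_y(\mu )={\textbf{1}}\{\mu ([y-0.1,y))=0\}\), which depends on \(\mu \) only through its restriction to \(\{x:x<y\}\); the products in (2.12) then vanish identically.

```python
def G(y, mu):
    # predictable toy integrand: G_y(mu) = 1{mu([y - 0.1, y)) = 0} depends on
    # mu restricted to [0, y) only; mu is a list of points in [0, 1]
    return float(not any(y - 0.1 <= p < y for p in mu))

def DG(x, y, mu):
    # D_x G_y(mu) = G_y(mu + delta_x) - G_y(mu)
    return G(y, mu + [x]) - G(y, mu)

mu = [0.25, 0.8]
for x, y in [(0.35, 0.4), (0.4, 0.35), (0.1, 0.9)]:
    # D_xG_y can only be nonzero if x < y, so D_xG_y * D_yG_x is always 0
    print(x, y, DG(x, y, mu) * DG(y, x, mu))
```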

Remark 2.3

If \(G_x(\eta )\equiv f(x)\) does not depend on \(\eta \) and \(f\in L^2(\lambda )\), then \(\varvec{\delta }(G)\) is the first Wiener–Itô integral \(I_1(f)\) of f (see, for example, [14, Chapter 12]). In this case, Theorem 2.1 yields the classical Stein bounds for the Wasserstein and the Kolmogorov distance,

$$\begin{aligned} d_W(I_1(f),N) \le \int |f(x)|^3 \, \lambda (dx) \end{aligned}$$

and

$$\begin{aligned} d_K(I_1(f),N) \le 2\bigg (\int f(x)^4 \, \lambda (dx) \bigg )^{1/2}, \end{aligned}$$

see, for example, [15, Corollary 3.4] and [13, Example 1.3].
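For a concrete instance of the Kolmogorov bound above (our illustration, assuming SciPy is available), take \(f=\lambda ^{-1/2}{\textbf{1}}_{[0,1]}\) with intensity measure \(\lambda \cdot \) Lebesgue on [0, 1], so that \(I_1(f)=(\eta ([0,1])-\lambda )/\sqrt{\lambda }\) is a normalised Poisson random variable and the bound reads \(2(\int f^4\,d\lambda )^{1/2}=2\lambda ^{-1/2}\).

```python
import numpy as np
from scipy.stats import norm, poisson

lam = 200.0
k = np.arange(0, 500)                 # atoms k of eta([0,1]); tail beyond is negligible
t = (k - lam) / np.sqrt(lam)          # atoms of the normalised variable I_1(f)
cdf = poisson.cdf(k, lam)             # P(I_1(f) <= t) at the atoms
# the sup over t is attained at an atom or in the left limit just before one
dk = max(np.abs(cdf - norm.cdf(t)).max(),
         np.abs(cdf - poisson.pmf(k, lam) - norm.cdf(t)).max())
print(dk, 2 / np.sqrt(lam))           # observed d_K is well below the bound
```

For \(\lambda =200\), the observed distance is roughly an order of magnitude below the bound, consistent with the \(\lambda ^{-1/2}\) rate.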

Remark 2.4

The paper [26] studies normal and Poisson approximation of innovations of general point processes with Papangelou conditional intensities, which include KS-integrals on the Poisson space. More precisely, Theorem 3.1 and Corollary 3.2 in [26] (with \(\pi =1\) there) provide bounds on the Wasserstein distance between a KS-integral and a standard normal random variable. In contrast to our main results, the bound in Theorem 3.1 from [26] still contains KS-integrals, as integration by parts is employed only once. Proceeding there with further integrations by parts might be challenging, since one of the KS-integrals is within an absolute value. The bound on the Wasserstein distance presented in Theorem 3.1 is evaluated in Corollary 3.2, but the resulting bound might not always behave as desired for a limit theorem. The first term on the right-hand side can be bounded from below by

$$\begin{aligned} \bigg |1- \mathbb {E}\int G_x^2 \, \lambda (dx)\bigg |, \end{aligned}$$

which does not become small if the KS-integral has variance one and the second term in (2.6) has a non-vanishing contribution (see Example 6.5 for such a situation). The third term contains only a product of two factors, which might not be sufficient if one rescales by the standard deviation of the KS-integral (see, for example, the situation discussed in Remarks 6.1 and 6.4 under the additional assumptions that \(u\) is constant and \(\varphi \) is translation invariant in its first argument).

Remark 2.5

In view of the works [16, 23], we expect that our results can be extended to the multivariate normal approximation of vectors of KS-integrals for distances based on smooth test functions and for the so-called \(d_{\textrm{convex}}\)-distance under suitable assumptions.

3 Preliminaries

In this section, we provide some basic properties of the difference operator D and the KS-integral \(\varvec{\delta }\). First of all, we recall from [10] the definitions of D and \(\varvec{\delta }\) as Malliavin operators. These definitions are based on nth-order Wiener–Itô integrals \(I_n\), \(n\in \mathbb {N}\); see also [14, Chapter 12]. For symmetric functions \(f\in L^2(\lambda ^n)\) and \(g\in L^2(\lambda ^m)\) with \(n,m\in \mathbb {N}\), we have

$$\begin{aligned} \mathbb {E}I_n(f) I_m(g) = {\textbf{1}}\{n=m\} n! \int f(x) g(x) \, \lambda ^n(dx). \end{aligned}$$
(3.1)

We use the convention \(I_0(c)=c\) for \(c\in \mathbb {R}\). Any \(H\in L^2(\mathbb {P}_\eta )\) admits a chaos expansion

$$\begin{aligned} H=\sum ^\infty _{n=0} I_n(h_n), \end{aligned}$$
(3.2)

where we recall our (somewhat sloppy) convention \(H\equiv H(\eta )\), and where \(h_0=\mathbb {E}H\) and the \(h_n\), \(n\in \mathbb {N}\), are symmetric elements of \(L^2(\lambda ^n)\). Here and in the following, a series of Wiener–Itô integrals is understood as its \(L^2\)-limit, whence all identities involving such sums hold almost surely. Then H is in the domain \({{\,\textrm{dom}\,}}D\) of the difference operator D (in the sense of a Malliavin operator) if

$$\begin{aligned} \sum ^\infty _{n=1}nn!\int h_n(x_1,\ldots ,x_n)^2\,\lambda ^n(d(x_1,\ldots ,x_n))<\infty . \end{aligned}$$

In this case, one has

$$\begin{aligned} D_xH=\sum ^\infty _{n=1}nI_{n-1}(h_n(x,\cdot )),\quad \lambda \text { -a.e. } \ x\in \mathbb {X}, \end{aligned}$$

see [10, Theorem 3], i.e. the pathwise defined difference operator from (2.1) can be represented in terms of the chaos expansion (3.2). For \(H\in L^2(\mathbb {P}_\eta )\), the relations \(H\in {{\,\textrm{dom}\,}}D\) and

$$\begin{aligned} \mathbb {E}\int (D_xH)^2\,\lambda (dx)<\infty \end{aligned}$$

are equivalent; see [10, Eq. (48)]. The (pathwise defined) difference operator satisfies the product rule

$$\begin{aligned} D_x(HH')=(D_xH)(H'+D_xH')+HD_xH',\quad x\in \mathbb {X}, \end{aligned}$$
(3.3)

for measurable \(H,H':\textbf{N}\rightarrow \mathbb {R}\).
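As a quick numerical check of the product rule (3.3) (our addition; counting measures are again represented as lists of points):

```python
# toy functionals of a counting measure mu, represented as a list of points
H  = lambda mu: len(mu) ** 2      # H(mu)  = mu(X)^2
Hp = lambda mu: sum(mu)           # H'(mu) = integral of x against mu

def D(f, x):
    # difference operator (2.1)
    return lambda mu: f(mu + [x]) - f(mu)

mu, x = [0.2, 0.5, 0.9], 0.4
lhs = D(lambda m: H(m) * Hp(m), x)(mu)
rhs = D(H, x)(mu) * (Hp(mu) + D(Hp, x)(mu)) + H(mu) * D(Hp, x)(mu)
print(lhs, rhs)                   # both equal 17.6, as (3.3) predicts
```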

Now let \(G:\textbf{N}\times \mathbb {X}\rightarrow \mathbb {R}\) be a measurable function such that \(G_x\equiv G(\cdot ,x)\in L^2(\mathbb {P}_\eta )\) for \(\lambda \)-a.e. x. Then there exist measurable functions \(g_n:\mathbb {X}^{n+1}\rightarrow \mathbb {R}\), \(n\in \mathbb {N}_0\), such that

$$\begin{aligned} G_x=\sum _{n=0}^\infty I_n(g_n(x,\cdot )), \quad \lambda \text { -a.e.} x\in \mathbb {X}. \end{aligned}$$
(3.4)

One says that G is in the domain \({{\,\textrm{dom}\,}}\varvec{\delta }\) of the KS-integral \(\varvec{\delta }\) if

$$\begin{aligned} \sum _{n=0}^\infty (n+1)! \int {\tilde{g}}_n({\textbf{x}})^2\,\lambda ^{n+1}(d{\textbf{x}})<\infty , \end{aligned}$$

where \({\tilde{g}}_n:\mathbb {X}^{n+1}\rightarrow \mathbb {R}\) is the symmetrisation of \(g_n\). In this case, the KS-integral of G is defined by

$$\begin{aligned} \varvec{\delta }(G){:}{=}\sum _{n=0}^\infty I_{n+1}({\tilde{g}}_n). \end{aligned}$$
(3.5)

We have \(\mathbb {E}\varvec{\delta }(G)=0\). If \(G\in {{\,\textrm{dom}\,}}\varvec{\delta }\cap L^1(\mathbb {P}_\eta \otimes \lambda )\), then \(\varvec{\delta }(G)\) is indeed given by the pathwise formula (1.1); see [10, Theorem 6]. If \(G\in L^2(\mathbb {P}_\eta \otimes \lambda )\), which is (2.2), and if (2.3) holds, then \(G\in {{\,\textrm{dom}\,}}\varvec{\delta }\) and

$$\begin{aligned} \mathbb {E}\varvec{\delta }(G)^2 =\mathbb {E}\int G_x^2\,\lambda (dx)+\mathbb {E}\int D_xG_y D_yG_x \,\lambda ^2(d(x,y)), \end{aligned}$$
(3.6)

see [13, Proposition 2.3] or [10, Theorem 5]. Thus, assumptions (2.2) and (2.3) on G in Theorem 2.1 are sufficient to guarantee that \(G\in {\text {dom}}\varvec{\delta }\).

For \(H\in {{\,\textrm{dom}\,}}D\) and \(G\in {{\,\textrm{dom}\,}}\varvec{\delta }\), we have the important integration by parts formula

$$\begin{aligned} \mathbb {E}H \varvec{\delta }(G)=\mathbb {E}\int G_x D_xH \,\lambda (dx); \end{aligned}$$
(3.7)

see, for example, [10, Theorem 4]. Unfortunately, the assumption \(H\in {{\,\textrm{dom}\,}}D\) is often not easy to check, and the sufficient conditions given above lead to rather strong integrability assumptions. Instead we shall often use the following two results.
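Before stating them, here is a Monte Carlo sketch of the integration by parts formula (3.7) itself (our illustration, not from the paper), with the assumed toy choices \(H(\mu )=\mu (\mathbb {X})^2\) and \(G_y(\mu )=y\,e^{-\mu (\mathbb {X})/50}\) on \(\mathbb {X}=[0,1]\) with intensity measure \(50\cdot \) Lebesgue; here \(D_xH=2\mu (\mathbb {X})+1\) does not depend on x, so the right-hand side of (3.7) is available in closed form given the number of points.

```python
import numpy as np

rng = np.random.default_rng(4)
lam = 50.0
c = lambda n: np.exp(-n / lam)   # toy integrand G_y(mu) = y * c(mu(X))

lhs, rhs = [], []
for _ in range(100000):
    n = rng.poisson(lam)
    s = rng.uniform(size=n).sum()
    delta_G = c(n - 1) * s - lam * c(n) * 0.5    # pathwise formula (1.1)
    lhs.append(n ** 2 * delta_G)                 # H * delta(G), H = eta(X)^2
    # D_xH = (n+1)^2 - n^2 = 2n + 1 for every x, and int G_x lam(dx) = lam c(n)/2
    rhs.append((2 * n + 1) * lam * c(n) * 0.5)

print(np.mean(lhs), np.mean(rhs))   # agree up to Monte Carlo error, as in (3.7)
```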

Lemma 3.1

Suppose that G satisfies (2.2) and (2.3), and let \(H\in L^2(\mathbb {P}_\eta )\) be such that \(D_xH\in L^2(\mathbb {P}_\eta )\) for \(\lambda \)-a.e. x. Then

$$\begin{aligned} \int |\mathbb {E}D_xH G_x|\,\lambda (dx)<\infty \end{aligned}$$
(3.8)

and (3.7) holds.

Proof

The proof is essentially that of Lemma 2.3 in [22]. For the convenience of the reader, we provide the main arguments. Since \(H\in L^2(\mathbb {P}_\eta )\), we can represent H as in (3.2). Similarly, we can write

$$\begin{aligned} D_xH=\sum ^\infty _{n=0} I_n(h'_n(x,\cdot )), \quad \lambda \text {-a.e.} \ x, \end{aligned}$$

where the measurable functions \(h'_n:\mathbb {X}^{n+1}\rightarrow \mathbb {R}\) are symmetric in the last n coordinates and square integrable with respect to \(\lambda ^n\). In fact, it follows from [14, Theorem 18.10] that we can choose

$$\begin{aligned} h'_n(x,{\textbf{x}})=(n+1)h_{n+1}(x,{\textbf{x}}). \end{aligned}$$

Combining this with (3.4) and (3.1), we obtain

$$\begin{aligned} \mathbb {E}D_xH G_x=\sum ^\infty _{n=0}(n+1)! \int h_{n+1}(x,{\textbf{x}})g_{n}(x,{\textbf{x}})\,\lambda ^n(d{\textbf{x}}) \end{aligned}$$

for \(\lambda \)-a.e. x. The Cauchy–Schwarz inequality (applied twice) yields

$$\begin{aligned} \int&|\mathbb {E}D_xH G_x|\,\lambda (dx)\\&\le \Bigg (\sum ^\infty _{n=0}(n+1)!\int h_{n+1}({\textbf{x}})^2\,\lambda ^{n+1}(d{\textbf{x}})\Bigg )^{1/2} \Bigg (\sum ^\infty _{n=0}(n+1)!\int g_{n}({\textbf{x}})^2\,\lambda ^{n+1}(d{\textbf{x}})\Bigg )^{1/2}. \end{aligned}$$

Since \(\mathbb {E}H^2<\infty \), the first factor on the above right-hand side is finite. By assumption (2.3), the second factor is finite as well; see the proof of [10, Theorem 5]. Hence, (3.8) holds. The remainder of the proof is as in [22]. \(\square \)

Lemma 3.2

Suppose that G satisfies (2.2) and (2.3), and let \(H:\textbf{N}\rightarrow \mathbb {R}\) be a measurable function satisfying

$$\begin{aligned} \mathbb {E}|H \varvec{\delta }(G)|<\infty . \end{aligned}$$
(3.9)

Then

$$\begin{aligned} |\mathbb {E}H \varvec{\delta }(G)|\le \mathbb {E}\int |D_xH G_x|\,\lambda (dx). \end{aligned}$$
(3.10)

Proof

If H is bounded, then (3.10) follows from Lemma 3.1. In the general case, we set \(H_r{:}{=}\max \{\min \{H,r\},-r\}\) for \(r>0\). Then (3.10) holds with \(H_r\) instead of H. Hence, the observation that \(|D_xH_r|\le |D_xH|\) for \(x\in \mathbb {X}\) (see [14, Exercise 18.4]) yields that

$$\begin{aligned} |\mathbb {E}H_r \varvec{\delta }(G)|\le \mathbb {E}\int |D_xH G_x|\,\lambda (dx). \end{aligned}$$

By (3.9), we can conclude the assertion from dominated convergence. \(\square \)

We often need the following (basically) well-known commutation rule for the KS-integral. For the pathwise defined version (1.1), this rule follows (under suitable integrability assumptions) by direct calculation.

Lemma 3.3

Suppose that G satisfies (2.2), (2.3) and (2.4). Then \(\varvec{\delta }(G)\in {{\,\textrm{dom}\,}}D\) and \(D_xG\in {{\,\textrm{dom}\,}}\varvec{\delta }\) for \(\lambda \)-a.e. x as well as

$$\begin{aligned} D_x\varvec{\delta }(G)=G_x+\varvec{\delta }(D_xG)\quad \hbox { a.s.},\, \lambda \text { -a.e. }\ x\in \mathbb {X}. \end{aligned}$$
(3.11)

Proof

We have already noticed at (3.6) that (2.2) and (2.3) imply \(G\in {{\,\textrm{dom}\,}}\varvec{\delta }\). Next we show that \(\varvec{\delta }(G)\in {{\,\textrm{dom}\,}}D\). Assumptions (2.2) and (2.3) ensure that \(G_x\in {{\,\textrm{dom}\,}}D\) for \(\lambda \)-a.e. x. Representing G as in (3.4) and using [10, Theorem 3] twice, we can write

$$\begin{aligned} D^2_{y,z}G_x=\sum ^\infty _{n=0}(n+2)(n+1) I_{n}(g_{n+2}(x,y,z,\cdot )), \quad \lambda ^2\hbox { -a.e. }\ (y,z)\in \mathbb {X}^2. \end{aligned}$$

By the \(L^2\)-convergence of the right-hand side and (3.1), we obtain

$$\begin{aligned} \mathbb {E}&\int (D^2_{y,z} G_x)^2\,\lambda ^3(d(x,y,z))\\&=\sum ^\infty _{n=0}(n+2)^2(n+1)^2n! \iint g_{n+2}(x,y,z,{\textbf{x}})^2\,\lambda ^n(d {\textbf{x}}) \,\lambda ^3(d(x,y,z))\\&=\sum ^\infty _{n=0}(n+2)(n+1)(n+2)! \int g_{n+2}({\textbf{x}})^2\,\lambda ^{n+3}(d {\textbf{x}}). \end{aligned}$$

By assumption (2.4), this is finite, which is equivalent to

$$\begin{aligned} \sum ^\infty _{n=2}n (n-1) n! \int g_{n}({\textbf{x}})^2\,\lambda ^{n+1}(d {\textbf{x}})<\infty . \end{aligned}$$

In view of (3.5) and the inequalities

$$\begin{aligned} \int {\tilde{g}}_{n}({\textbf{x}})^2\,\lambda ^{n+1}(d {\textbf{x}}) \le \int g_{n}({\textbf{x}})^2\,\lambda ^{n+1}(d {\textbf{x}}) \end{aligned}$$

(a consequence of Jensen’s inequality), this yields that \(\varvec{\delta }(G)\in {{\,\textrm{dom}\,}}D\).

Let \(G'\) be another measurable function satisfying (2.2) and (2.3). It follows from (3.6) and the polarisation identity that

$$\begin{aligned} \mathbb {E}\varvec{\delta }(G)\varvec{\delta }(G') =\mathbb {E}\int G_xG_x'\,\lambda (dx)+\mathbb {E}\int D_xG_y D_yG'_x \,\lambda ^2(d(x,y)). \end{aligned}$$
(3.12)

The integration by parts formula (3.7) yields that

$$\begin{aligned} \mathbb {E}\varvec{\delta }(G)\varvec{\delta }(G')=\mathbb {E}\int G'_x D_x\varvec{\delta }(G)\,\lambda (dx). \end{aligned}$$

Assumptions (2.3) and (2.4) show that \(D_xG\in {{\,\textrm{dom}\,}}\varvec{\delta }\) for \(\lambda \)-almost all x and that \(\varvec{\delta }(D_\cdot G)\) belongs to \(L^2(\mathbb {P}\otimes \lambda )\) (see (3.6) and the discussion before it). Therefore, we obtain from Fubini’s theorem and integration by parts that

$$\begin{aligned} \mathbb {E}\iint D_xG_y D_yG'_x \,\lambda (dy)\,\lambda (dx) =\mathbb {E}\int G'_x \varvec{\delta }(D_xG)\,\lambda (dx), \end{aligned}$$

where we could apply Fubini’s theorem on the left-hand side due to (2.3) and on the right-hand side by the Cauchy–Schwarz inequality and the square integrability of \(G'\) and \(\varvec{\delta }(D_\cdot G)\). Inserting these two results into (3.12) yields

$$\begin{aligned} \mathbb {E}\int G'_x D_x\varvec{\delta }(G)\,\lambda (dx) =\mathbb {E}\int G'_x G_x \,\lambda (dx) +\mathbb {E}\int G'_x \varvec{\delta }(D_xG)\,\lambda (dx). \end{aligned}$$

Since the class of functions \(G'\) with the required properties is dense in \(L^2(\mathbb {P}_\eta \otimes \lambda )\) (see, for example, the proof of [10, Theorem 5]), we conclude the asserted formula (3.11). \(\square \)
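For the pathwise version (1.1), the commutation rule (3.11) can indeed be verified by direct calculation; the following sketch (our addition, with the same assumed toy integrand \(G_y(\mu )=y\,e^{-\mu (\mathbb {X})/50}\) as above) confirms it numerically for a single realisation.

```python
import numpy as np

lam, x = 50.0, 0.3
c = lambda n: np.exp(-n / lam)   # G_y(mu) = y * c(mu(X))

def delta_G(pts):
    # pathwise KS-integral (1.1) of G
    n = len(pts)
    return c(n - 1) * sum(pts) - lam * c(n) * 0.5

def delta_DxG(pts):
    # pathwise KS-integral of D_xG, using D_xG_y(mu) = y (c(mu(X)+1) - c(mu(X)))
    n = len(pts)
    return (c(n) - c(n - 1)) * sum(pts) - lam * (c(n + 1) - c(n)) * 0.5

rng = np.random.default_rng(3)
pts = list(rng.uniform(size=rng.poisson(lam)))
lhs = delta_G(pts + [x]) - delta_G(pts)      # D_x delta(G)
rhs = x * c(len(pts)) + delta_DxG(pts)       # G_x + delta(D_xG), as in (3.11)
print(lhs, rhs)                              # equal up to floating point error
```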

4 Proof for the Wasserstein Distance in Theorem 2.1

Our proof is similar to the proofs of Theorems 1.1 and 1.2 in [13] and relies on the ideas already present in [15]. The first step is to recall Stein’s method. Let \({\textbf{C}}_{1,2}\) be the set of all twice continuously differentiable functions \(g:\mathbb {R}\rightarrow \mathbb {R}\) whose first derivative is bounded in absolute value by 1 and the second derivative by 2. Then we have for an integrable random variable X that

$$\begin{aligned} d_W(X,N)\le \sup _{g\in {\textbf{C}}_{1,2}}|\mathbb {E}[g'(X)-Xg(X)]|. \end{aligned}$$

Let the function G satisfy the assumptions of Theorem 2.1 and write \(X{:}{=}\varvec{\delta }(G)\). By the definition of the KS-integral we can write \(X\equiv X(\eta )\) as a measurable function of \(\eta \). Let \(g\in {\textbf{C}}_{1,2}\). Then we have for \(\lambda \)-a.e. \(x\in \mathbb {X}\) and a.s. that

$$\begin{aligned} D_xg(X)=g(X(\eta +\delta _x))-g(X(\eta ))=g(X+D_xX)-g(X). \end{aligned}$$
(4.1)

Since \(g\) is Lipschitz (by the boundedness of its first derivative) and \(X\in {{\,\textrm{dom}\,}}D\) by Lemma 3.3, it follows that \(|D_xg(X)|\le |D_xX|\), so that \(Dg(X)\) (considered as a function on \(\textbf{N}\times \mathbb {X}\)) is square integrable with respect to \(\mathbb {P}_\eta \otimes \lambda \). Since, moreover, it is clear that \(g(X)\) is square integrable, we have in particular that \(g(X)\in {{\,\textrm{dom}\,}}D\). The integration by parts formula (3.7) yields that

$$\begin{aligned} \mathbb {E}Xg(X)=\mathbb {E}\int G_xD_xg(X)\,\lambda (dx). \end{aligned}$$
(4.2)

Since \(G\in L^2(\mathbb {P}_\eta \otimes \lambda )\) and \(X\in {{\,\textrm{dom}\,}}D\), we obtain from the Lipschitz continuity of g and the Cauchy–Schwarz inequality that

$$\begin{aligned} \mathbb {E}\int |G_xD_xg(X)|\,\lambda (dx) \le \mathbb {E}\int |G_x| |D_xX|\,\lambda (dx) <\infty . \end{aligned}$$
(4.3)

We have that

$$\begin{aligned} D_xg(X)&= g(X+D_xX)-g(X) = \int _X^{X+D_xX} g'(t) \, dt \\ {}&= D_xX \int _0^1 g'(X+sD_xX) \, ds. \end{aligned}$$

Our assumptions on G allow us to apply the commutation rule (3.11) to \(D_xX\), yielding a.s. and for \(\lambda \)-a.e. x that

$$\begin{aligned} G_xD_xg(X)&=G_x D_xX \int _0^1 g'(X+sD_xX) \, ds\\&=\int _0^1 G_x (G_x + \varvec{\delta }(D_xG)) g'(X+sD_xX) \, ds \\&= \int _0^1 G_x^2 g'(X+sD_xX) \, ds + \int _0^1 G_x \varvec{\delta }(D_xG) g'(X+sD_xX) \, ds \\&=:S_1(x)+S_2(x). \end{aligned}$$

In view of \(|g'|\le 1\), (3.11), (2.2) and (4.3), we can note that

$$\begin{aligned} \mathbb {E}\int \int ^1_0 |G_x \varvec{\delta }(D_xG) g'(X+sD_xX)| \,ds\,\lambda (dx)&\le \mathbb {E}\int |G_x| (|D_xX| + |G_x|) \,\lambda (dx)\nonumber \\&<\infty . \end{aligned}$$
(4.4)

We obtain

$$\begin{aligned} |\mathbb {E}[g'(X)-Xg(X)]|&\le \bigg | \mathbb {E}g'(X) \bigg ( 1 - \int G_x^2 \, \lambda (dx) - \int D_xG_y D_y G_x \, \lambda ^2(d(x,y)) \bigg ) \bigg | \\&\quad + \bigg | \mathbb {E}\int \big (g'(X) G_x^2-S_1(x)\big ) \,\lambda (dx) \bigg | \\&\quad + \bigg | \mathbb {E}g'(X) \int D_xG_y D_y G_x \, \lambda ^2(d(x,y)) - \mathbb {E}\int S_2(x)\,\lambda (dx) \bigg | \\&=: U_0 + U_1 + U_2. \end{aligned}$$

Since \(\mathbb {E}\varvec{\delta }(G)^2=1\), Jensen’s inequality and (3.6) yield that

$$\begin{aligned} U_0&\le \mathbb {E}\Big | 1- \int G_x^2 \, \lambda (dx) - \int D_xG_y D_y G_x \, \lambda ^2(d(x,y)) \Big | \\&\le \left( {{\,\mathrm{{\mathbb Var}}\,}}\int G_x^2 \, \lambda (dx) \right) ^{1/2} + \left( {{\,\mathrm{{\mathbb Var}}\,}}\int D_xG_y D_y G_x \, \lambda ^2(d(x,y)) \right) ^{1/2}. \end{aligned}$$

It follows from the Poincaré inequality (see [14, Section 18.3]) that

$$\begin{aligned} {{\,\mathrm{{\mathbb Var}}\,}}\int G_x^2 \, \lambda (dx) \le \mathbb {E}\int \bigg (\int D_y(G_x^2) \, \lambda (dx)\bigg )^2 \, \lambda (dy) = T_1^2 \end{aligned}$$

and

$$\begin{aligned} {{\,\mathrm{{\mathbb Var}}\,}}\int D_xG_y D_y G_x \, \lambda ^2(d(x,y)){} & {} \le \mathbb {E}\int \bigg ( \int D_z\big ( D_xG_y D_y G_x \big ) \, \lambda ^2(d(x,y))\bigg )^2 \, \lambda (dz) \\{} & {} = T_2^2, \end{aligned}$$

whence

$$\begin{aligned} U_0 \le T_1+T_2. \end{aligned}$$
(4.5)

We now turn to \(U_1\). We note first that, by \(|g'|\le 1\) and (2.2),

$$\begin{aligned} \mathbb {E}\int \int ^1_0 G_x^2\big |g'(X)-g'(X+sD_xX)\big |\, ds \, \lambda (dx)<\infty . \end{aligned}$$

Because of

$$\begin{aligned} g'(X+sD_xX)-g'(X) = s D_xX \int _0^1 g''(X+stD_xX) \, dt =: D_xX H(s,x) \end{aligned}$$
(4.6)

for \(x\in \mathbb {X}\) and \(s\in [0,1]\), we have that

$$\begin{aligned} U_1&= \bigg |\mathbb {E}\int \int _0^1 G_x^2(g'(X+sD_xX) - g'(X)) \, ds \, \lambda (dx)\bigg | \nonumber \\&= \bigg |\mathbb {E}\int \int _0^1 G_x^2 D_xX H(s,x) \, ds \, \lambda (dx)\bigg |\nonumber \\&\le \bigg |\mathbb {E}\int \int _0^1 G_x^2 G_x H(s,x) \, ds\, \lambda (dx)\bigg | \nonumber \\&\quad +\bigg |\mathbb {E}\int \int _0^1 G_x^2 \varvec{\delta }(D_xG) H(s,x) \, ds \, \lambda (dx)\bigg |, \end{aligned}$$
(4.7)

where we have used the commutation rule (3.11) in the last step. To justify splitting the integral in this way, we can assume without loss of generality that

$$\begin{aligned} T_3=\mathbb {E}\int |G_x|^3\, \lambda (dx)<\infty \end{aligned}$$

and use that \(|g''|\le 2\). The latter inequality yields that \(|H(s,x)|\le 2s\) and

$$\begin{aligned} \bigg | \mathbb {E}\int \int _0^1 G_x^2 G_x H(s,x) \, ds \, \lambda (dx) \bigg | \le \mathbb {E}\int \int _0^1 |G_x|^3 |H(s,x)| \, ds \, \lambda (dx) \le T_3. \end{aligned}$$

To treat the term (4.7), we first use \(|\varvec{\delta }(D_xG)|\le |D_xX|+|G_x|\) for \(x\in \mathbb {X}\) (see (3.11)), (4.6) and the preceding integrability properties to conclude that

$$\begin{aligned} \begin{aligned}&\mathbb {E}\int \int ^1_0 G_x^2 |\varvec{\delta }(D_xG) H(s,x)|\, ds \, \lambda (dx) \\&\le \mathbb {E}\int \int ^1_0 |G_x|^3 |H(s,x)|\, ds \, \lambda (dx) + \mathbb {E}\int \int ^1_0 G_x^2 |D_xX H(s,x)|\, ds \, \lambda (dx) \\&= \mathbb {E}\int \int ^1_0 |G_x|^3 |H(s,x)|\, ds \, \lambda (dx) + \mathbb {E}\int \int ^1_0 G_x^2 |g'(X)-g'(X+sD_xX)|\, ds \, \lambda (dx) <\infty . \end{aligned} \end{aligned}$$
(4.8)

Therefore, we obtain from Fubini’s theorem that

$$\begin{aligned} U_1\le T_3+ \int \int ^1_0 |\mathbb {E}G_x^2 \varvec{\delta }(D_xG) H(s,x)|\, ds \, \lambda (dx). \end{aligned}$$

The expectation on the above right-hand side can be bounded with Lemma 3.2 applied to \(H{:}{=}G_x^2H(s,x)\) and with \(D_xG\) instead of G (justified by (2.3), (2.4), and (4.8)). This gives

$$\begin{aligned} U_1&\le T_3+ \int \int ^1_0 \mathbb {E}|D_xG_y| \big |D_y\big (G_x^2 H(s,x)\big )\big |\, ds\, \lambda ^2(d(x,y)) \\&\le T_3+\mathbb {E}\int |D_xG_y| (|D_y(G_x^2)| + 2G_x^2)\,\lambda ^2(d(x,y)), \end{aligned}$$

where we used (3.3), \(|D_yH(s,x)+H(s,x)|\le 2s\), and \(|D_yH(s,x)|\le 4s\).

Now we turn to the term \(U_2\). Define \(R_x{:}{=}\int ^1_0g'(X+sD_xX)\,ds\), \(x\in \mathbb {X}\). By the integrability property (4.4) and Fubini’s theorem,

$$\begin{aligned} \mathbb {E}\int S_2(x)\,\lambda (dx) =\int \mathbb {E}\varvec{\delta }(D_xG)G_xR_x \,\lambda (dx). \end{aligned}$$

By Lemma 3.1, whose assumptions are satisfied for \(\lambda \)-a.e. x by (2.2)–(2.4) and \(|g'|\le 1\), and the product rule (3.3),

$$\begin{aligned} \mathbb {E}&\int S_2(x)\,\lambda (dx) =\int \int \mathbb {E}D_xG_yD_y(G_xR_x) \,\lambda (dy) \, \lambda (dx)\\&=\int \int (\mathbb {E}D_xG_yD_yG_xR_x+\mathbb {E}D_xG_y(G_x+D_yG_x)D_yR_x) \,\lambda (dy) \, \lambda (dx), \end{aligned}$$

so that

$$\begin{aligned} U_2&\le \int \bigg |\mathbb {E}D_yG_x D_xG_y \int _0^1 \big (g'(X+sD_xX) - g'(X)\big )\,ds \bigg |\, \lambda ^2(d(x,y)) \\&\qquad + \int \bigg |\mathbb {E}D_xG_y(G_x + D_yG_x) D_y\bigg (\int _0^1 g'(X+sD_xX) \, ds\bigg )\bigg |\,\lambda ^2(d(x,y)) \\&=: U_{2,1} + U_{2,2}. \end{aligned}$$

Here, the expectations exist for \(\lambda ^2\)-a.e. \((x,y)\) because of \(|g'|\le 1\), (2.2) and (2.3). In view of the definition of \(T_4\), we can assume without loss of generality that

$$\begin{aligned} \mathbb {E}\int |D_yG_x D_xG_y G_x|\,\lambda ^2(d(x,y))<\infty . \end{aligned}$$
(4.9)

The commutation rule (3.11) leads to

$$\begin{aligned} U_{2,1}&= \int \bigg |\mathbb {E}D_yG_x D_xG_y D_xX \int _0^1\int _0^1 sg''(X+stD_xX) \, ds \, dt \bigg |\,\lambda ^2(d(x,y)) \\&\le \int \bigg | \mathbb {E}D_yG_x D_xG_y G_x \int _0^1\int _0^1 sg''(X+stD_xX)\,ds\,dt \bigg |\,\lambda ^2(d(x,y)) \\&\quad + \int \bigg |\mathbb {E}D_yG_x D_xG_y\varvec{\delta }(D_xG)\int _0^1\int _0^1 sg''(X+stD_xX) \,ds\,dt\bigg |\,\lambda ^2(d(x,y)). \end{aligned}$$

The following computation as well as (2.3) and (2.4) allow us to apply Lemma 3.2 to the second term on the right-hand side. From the commutation rule (3.11), the boundedness of \(g'\) and \(g''\), (4.9) and (2.3), we obtain

$$\begin{aligned}&\int \mathbb {E}\bigg |D_yG_x D_xG_y\varvec{\delta }(D_xG)\int _0^1\int _0^1 sg''(X+stD_xX) \,ds\,dt\bigg | \,\lambda ^2(d(x,y)) \\&\le \int \mathbb {E}|D_yG_x D_xG_y G_x| \,\lambda ^2(d(x,y)) \\&\quad + \int \mathbb {E}\bigg |D_yG_x D_xG_y D_xX \int _0^1\int _0^1 sg''(X+stD_xX) \,ds\,dt\bigg | \,\lambda ^2(d(x,y)) \\&\le \int \mathbb {E}|D_yG_x D_xG_y G_x| \,\lambda ^2(d(x,y)) \\&\quad + \int \mathbb {E}\bigg |D_yG_x D_xG_y \int _0^1 (g'(X+sD_xX) - g'(X)) \,ds\bigg | \,\lambda ^2(d(x,y)) < \infty . \end{aligned}$$

Thus, we derive from Lemma 3.2 and \(|g''|\le 2\) that

$$\begin{aligned} U_{2,1}&\le \mathbb {E}\int |D_yG_x D_xG_y G_x| \, \lambda ^2(d(x,y)) \\&\quad + \int \int \mathbb {E}\bigg |D_xG_z D_z \bigg (D_yG_x D_xG_y \int _0^1\int _0^1 sg''(X+stD_xX) \, ds \, dt\bigg ) \bigg | \, \lambda (dz)\,\lambda ^2(d(x,y)) \\&\le \mathbb {E}\int |D_yG_x D_xG_y G_x| \, \lambda ^2(d(x,y)) \\&\quad + \mathbb {E}\int |D_xG_z| \big (|D_z\big (D_yG_x D_xG_y\big )| + 2|D_yG_x D_xG_y| \big ) \, \lambda ^3(d(x,y,z)), \end{aligned}$$

where we used (3.3) in the last step. Similarly to (4.1), we derive

$$\begin{aligned} D_y\bigg (\int _0^1&g'(X+sD_xX) \, ds\bigg ) \nonumber \\&= \int _0^1 (g'(X+sD_xX+D_yX+sD^2_{x,y}X) - g'(X+sD_xX)) \, ds \nonumber \\&= \int _0^1 \int _0^1 (D_yX+sD^2_{x,y}X) g''(X+sD_xX+ t (D_yX+sD^2_{x,y}X)) \, dt \, ds\nonumber \\&= \int _0^1 (D_yX+sD^2_{x,y}X) R(s,x,y) \, ds \end{aligned}$$
(4.10)

for \(x,y\in \mathbb {X}\) with

$$\begin{aligned} R(s,x,y){:}{=}\int _0^1 g''(X+sD_xX+ t (D_yX+sD^2_{x,y}X)) \, dt. \end{aligned}$$

By assumptions (2.2)–(2.5), we can use the commutation rule (3.11) twice to obtain that

$$\begin{aligned} D^2_{x,y}X=D_y(D_x\varvec{\delta }(G))=D_y(G_x+\varvec{\delta }(D_xG))=D_yG_x+D_xG_y+\varvec{\delta }(D^2_{x,y}G) \end{aligned}$$

a.s. and for \(\lambda ^2\)-a.e. \((x,y)\), while \(D_yX=G_y+\varvec{\delta }(D_yG)\) a.s. and for \(\lambda \)-a.e. y. Therefore, (4.10) equals

$$\begin{aligned} \int _0^1 (G_y+\varvec{\delta }(D_yG)+s (D_xG_y+D_yG_x + \varvec{\delta }(D^2_{x,y}G)))R(s,x,y)\,ds. \end{aligned}$$

For \(s\in [0,1]\), one has

$$\begin{aligned} \begin{aligned} \big |(G_y&+\varvec{\delta }(D_yG)+s (D_xG_y+D_yG_x + \varvec{\delta }(D^2_{x,y}G)))R(s,x,y)\big | \\&= \big | (D_yX+s D^2_{x,y}X) R(s,x,y) \big | \\&= \bigg | \int _0^1 (D_yX+s D^2_{x,y}X) g''(X+sD_xX+ t (D_yX+sD^2_{x,y}X)) \, dt \bigg | \\&= \big | g'(X+sD_xX+ D_yX+sD^2_{x,y}X) - g'(X+sD_xX) \big | \le 2, \end{aligned} \end{aligned}$$
(4.11)

whence

$$\begin{aligned}{} & {} \bigg | \int _0^1 (\varvec{\delta }(D_yG)+s \varvec{\delta }(D^2_{x,y}G)) R(s,x,y)\,ds \bigg | \\{} & {} \quad \le 2 + \bigg | \int _0^1 (G_y+s (D_xG_y+D_yG_x))R(s,x,y)\,ds \bigg |. \end{aligned}$$

Since \(|R(s,x,y)|\le 2\),

$$\begin{aligned} \bigg | \int _0^1 (G_y+s (D_xG_y+D_yG_x))R(s,x,y)\,ds \bigg | \le 2 |G_y| + |D_xG_y+D_yG_x|. \end{aligned}$$

Because of the assumption \(T_4<\infty \), this yields

$$\begin{aligned}{} & {} \int \bigg | \mathbb {E}D_xG_y (G_x+D_yG_x) \int _0^1 (G_y+s (D_xG_y+D_yG_x))R(s,x,y)\,ds \bigg | \, \lambda ^2(d(x,y)) \\{} & {} \quad \le \int \mathbb {E}|D_xG_y (G_x+D_yG_x)| (2 |G_y| + |D_xG_y+D_yG_x|) \, \lambda ^2(d(x,y))<\infty . \end{aligned}$$

Together with (2.2) and (2.3), we deduce from (4.11) that

$$\begin{aligned}{} & {} \mathbb {E}\int _0^1 \big | D_xG_y (G_x+D_yG_x) (\varvec{\delta }(D_yG)+s \varvec{\delta }(D^2_{x,y}G)) R(s,x,y) \big | \,ds \nonumber \\{} & {} \quad \le \mathbb {E}|D_xG_y (G_x+D_yG_x)| (2 + 2 |G_y| + |D_xG_y+D_yG_x|) <\infty \end{aligned}$$
(4.12)

for \(\lambda ^2\)-a.e. \((x,y)\). Hence, we have shown that

$$\begin{aligned} U_{2,2}&\le \mathbb {E}\int \big |(G_x + D_yG_x) D_xG_y\big | (2|G_y|+|D_xG_y+D_yG_x|) \, \lambda ^2(d(x,y)) \\&\quad + \int _0^1 \int \big |\mathbb {E}(G_x + D_yG_x) D_xG_y\varvec{\delta }(D_yG+sD^2_{x,y}G) R(s,x,y) \big | \, \lambda ^2(d(x,y))\, ds. \end{aligned}$$

By Lemma 3.2, which can be applied due to (4.12), the second term on the right-hand side can be further bounded by

$$\begin{aligned}&\int _0^1 \int \big |\mathbb {E}(D_yG_z+sD^2_{x,y}G_z) D_z((G_x + D_yG_x) D_xG_y R(s,x,y)) \big |\,\lambda ^3(d(x,y,z))\, ds\\&\quad \le 2\,\mathbb {E}\int (|D_yG_z|+|D^2_{x,y}G_z|) \big (|D_z\big ((G_x + D_yG_x) D_xG_y\big )| + 2 |(G_x + D_yG_x) D_xG_y| \big ) \\&\qquad \qquad \qquad \times \lambda ^3(d(x,y,z)). \end{aligned}$$

Combining the previous bounds, we see that

$$\begin{aligned} U_1+U_2 \le&\mathbb {E}\int |G_x|^3\, \lambda (dx) + \mathbb {E}\int \big ( 2 |D_xG_y D_yG_x G_x| + |D_xG_y (D_yG_x)^2| + 2G_x^2 |D_xG_y| \\&\quad + \big | (G_x + D_yG_x) D_xG_y\big | (2|G_y|+|D_xG_y+D_yG_x|) \\&\quad + |D_yG_x D_xG_y G_x| \big )\, \lambda ^2(d(x,y)) \\&\quad + \mathbb {E}\int 2(|D_yG_z|+|D^2_{x,y}G_z|) \big (|D_z\big ((G_x + D_yG_x) D_xG_y\big )| \\&\quad + 2|(G_x + D_yG_x) D_xG_y| \big ) \\&\quad + |D_xG_z| \big (|D_z\big (D_yG_x D_xG_y\big )| + 2|D_yG_x D_xG_y| \big )\, \lambda ^3(d(x,y,z)) \\ =&T_3 + T_4 + T_5, \end{aligned}$$

which together with (4.5) completes the proof.

5 Proof for the Kolmogorov Distance in Theorem 2.1

We prepare the proof of the second part of Theorem 2.1 with two lemmas. Since we consider iterated KS-integrals in the following, we indicate the integration variable as a subscript, i.e. we write \(\varvec{\delta }_x\) to denote the KS-integral with respect to x.

Lemma 5.1

Let \(h:{\textbf{N}}\times \mathbb {X}^2\rightarrow \mathbb {R}\) be measurable and such that

$$\begin{aligned} \mathbb {E}\int h(x,y)^2 \, \lambda ^2(d(x,y)) + \mathbb {E}\int (D_zh(x,y))^2 \, \lambda ^3(d(x,y,z)) \nonumber \\ + \mathbb {E}\int (D^2_{z,w}h(x,y))^2 \, \lambda ^4(d(x,y,z,w)) <\infty . \end{aligned}$$
(5.1)
(i)

    Then, \(\varvec{\delta }_x(\varvec{\delta }_y(h(x,y)))\) is well defined and

    $$\begin{aligned} \mathbb {E}\big [ \varvec{\delta }_x(\varvec{\delta }_y(h(x,y)))^2 \big ]\le & {} 3 \mathbb {E}\int h(x,y)^2 \, \lambda ^2(d(x,y)) \\{} & {} \quad + 3 \mathbb {E}\int \big (D_zh(x,y)\big )^2 \, \lambda ^3(d(x,y,z)) \\{} & {} \quad + 2\mathbb {E}\int \big (D^2_{w,z}h(x,y)\big )^2 \, \lambda ^4(d(x,y,z,w)). \end{aligned}$$
(ii)

    If \(H\in L^2(\mathbb {P}_\eta )\) is such that \(D_xH\in L^2(\mathbb {P}_\eta )\) for \(\lambda \)-a.e. x, \(D^2_{x,y}H\in L^2(\mathbb {P}_\eta )\) for \(\lambda ^2\)-a.e. \((x,y)\) and

    $$\begin{aligned} \mathbb {E}\int |D^2_{x,y}H h(x,y)| \, \lambda ^2(d(x,y))<\infty , \end{aligned}$$
    (5.2)

    then

    $$\begin{aligned} \mathbb {E}\int D^2_{x,y}H h(x,y) \, \lambda ^2(d(x,y)) = \mathbb {E}\big [ \varvec{\delta }_x(\varvec{\delta }_y(h(x,y))) H \big ]. \end{aligned}$$

Proof

First, let us assume that all KS-integrals are well defined. By iteratively applying [13, Corollary 2.4] and (3.11), we obtain

$$\begin{aligned}&\mathbb {E}\big [ \varvec{\delta }_x(\varvec{\delta }_y(h(x,y)))^2 \big ] \\&\le \mathbb {E}\int \varvec{\delta }_y(h(x,y))^2 \, \lambda (dx) + \mathbb {E}\int (D_z\varvec{\delta }_y(h(x,y)))^2 \, \lambda ^2(d(x,z)) \\&\le \mathbb {E}\int \varvec{\delta }_y(h(x,y))^2 \, \lambda (dx) + 2\mathbb {E}\int h(x,z)^2 \, \lambda ^2(d(x,z)) + 2\mathbb {E}\int \varvec{\delta }_y(D_zh(x,y))^2 \, \lambda ^2(d(x,z)) \\&\le \mathbb {E}\int h(x,y)^2 \, \lambda ^2(d(x,y)) + \mathbb {E}\int \big (D_zh(x,y)\big )^2 \, \lambda ^3(d(x,y,z)) + 2\mathbb {E}\int h(x,z)^2 \, \lambda ^2(d(x,z)) \\&\quad + 2\mathbb {E}\int \big (D_zh(x,y)\big )^2 \, \lambda ^3(d(x,y,z)) + 2\mathbb {E}\int \big (D^2_{w,z}h(x,y)\big )^2 \, \lambda ^4(d(x,y,z,w)) \\&= 3 \mathbb {E}\int h(x,y)^2 \, \lambda ^2(d(x,y)) + 3 \mathbb {E}\int \big (D_zh(x,y)\big )^2 \, \lambda ^3(d(x,y,z)) \\&\quad + 2\mathbb {E}\int \big (D^2_{w,z}h(x,y)\big )^2 \, \lambda ^4(d(x,y,z,w)). \end{aligned}$$

Since, by (5.1), the right-hand side is finite, all involved KS-integrals are well defined by [13, Proposition 2.3].

Because of (5.2) and Fubini’s theorem, we have

$$\begin{aligned} J{:}{=}\mathbb {E}\int D^2_{x,y}H h(x,y) \, \lambda ^2(d(x,y)) = \int \int \mathbb {E}D^2_{x,y}H h(x,y) \, \lambda (dy) \, \lambda (dx). \end{aligned}$$

For \(\lambda \)-a.e. x, our assumptions imply \(D_xH\in L^2(\mathbb {P}_\eta )\), \(D^2_{x,y}H\in L^2(\mathbb {P}_\eta )\) for \(\lambda \)-a.e. y as well as

$$\begin{aligned} \mathbb {E}\int h(x,y)^2 \, \lambda (dy)<\infty \quad \text {and} \quad \mathbb {E}\int (D_zh(x,y))^2 \, \lambda ^2(d(y,z))<\infty . \end{aligned}$$

Thus, it follows from Lemma 3.1 that

$$\begin{aligned} J = \int \mathbb {E}D_xH \varvec{\delta }_y(h(x,y)) \, \lambda (dx). \end{aligned}$$

Since \(H\in L^2(\mathbb {P}_\eta )\) and \(D_xH\in L^2(\mathbb {P}_\eta )\) for \(\lambda \)-a.e. x, and since combining (5.1) and [13, Corollary 2.4] as in the proof of part (i) yields

$$\begin{aligned} \mathbb {E}\int \varvec{\delta }_y(h(x,y))^2 \, \lambda (dx)<\infty \quad \text {and} \quad \mathbb {E}\int (D_z\varvec{\delta }_y(h(x,y)))^2 \, \lambda ^2(d(x,z))<\infty , \end{aligned}$$

a further application of Lemma 3.1 leads to

$$\begin{aligned} J = \mathbb {E}H \varvec{\delta }_x(\varvec{\delta }_y(h(x,y))), \end{aligned}$$

which concludes the proof of part (ii). \(\square \)

For \(a\in \mathbb {R}\), let \(f_a\) be a solution of the Stein equation

$$\begin{aligned} f_a'(u) - u f_a(u) = {\textbf{1}}\{u\le a\} - \Phi (a), \quad u\in \mathbb {R}, \end{aligned}$$
(5.3)

where \(\Phi \) is the distribution function of the standard normal distribution. Note that \(f_a\) is continuously differentiable on \(\mathbb {R}\setminus \{a\}\). Thus, we use the convention that \(f_a'(a)\) is the left-sided limit of \(f_a'\) at a. For the following lemma, we refer the reader to [4, Lemma 2.2 and Lemma 2.3].

Lemma 5.2

For each \(a\in \mathbb {R}\), there exists a unique bounded solution \(f_a\) of (5.3). This function satisfies:

(i)

    \(u\mapsto u f_a(u)\) is non-decreasing;

(ii)

    \(|uf_a(u)|\le 1\) for all \(u\in \mathbb {R}\);

(iii)

    \(|f_a'(u)|\le 1\) for all \(u\in \mathbb {R}\).
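The bounded solution of (5.3) has a well-known closed form, which we take from the literature on Stein's method and state without proof: \(f_a(u)=\sqrt{2\pi }\,e^{u^2/2}\Phi (u)(1-\Phi (a))\) for \(u\le a\) and \(f_a(u)=\sqrt{2\pi }\,e^{u^2/2}\Phi (a)(1-\Phi (u))\) for \(u>a\). The following sketch (our addition, assuming SciPy is available) evaluates this formula and checks the Stein equation (5.3) and property (ii) numerically.

```python
import numpy as np
from scipy.stats import norm

def f_a(u, a):
    # closed form of the bounded solution of the Stein equation (5.3)
    if u <= a:
        return np.sqrt(2 * np.pi) * np.exp(u**2 / 2) * norm.cdf(u) * (1 - norm.cdf(a))
    return np.sqrt(2 * np.pi) * np.exp(u**2 / 2) * norm.cdf(a) * (1 - norm.cdf(u))

a, h = 0.5, 1e-6
for u in [-2.0, 0.0, 0.49, 1.5]:
    # forward difference approximates f_a'(u); both sides of (5.3) should match
    lhs = (f_a(u + h, a) - f_a(u, a)) / h - u * f_a(u, a)
    print(u, lhs, float(u <= a) - norm.cdf(a))
    print('   |u f_a(u)| <= 1:', abs(u * f_a(u, a)) <= 1)   # property (ii)
```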

Now we are ready for the proof for the Kolmogorov distance. It combines the approach for the Wasserstein distance with arguments from [8], which refined ideas previously used in [5] and [22]. Indeed, for the normal approximation of Poisson functionals in the Kolmogorov distance, the Malliavin–Stein method was first used in [22]. One of the terms in the bound was removed in [5] and two more in [8]. The innovation of [8], which was inspired by the proof of Theorem 2.2 in [24] and which we also employ in the following, is to exploit the monotonicity of \(u\mapsto u f_a(u)\) and \(u\mapsto {\textbf{1}}\{u\le a\}\).

Proof for the Kolmogorov distance in Theorem 2.1

Throughout the proof, we can assume without loss of generality that \(T_1,T_2,T_6,T_7,T_8,T_9<\infty \). Let \(a\in \mathbb {R}\), and let \(f_a\) be the solution of (5.3) from Lemma 5.2. For \(X{:}{=}\varvec{\delta }(G)\), we have \(f_a(X)\in {\text {dom}} D\) (since \(|f'_a|\le 1\) and \(X\in {\text {dom}} D\)), whence the integration by parts formula (3.7) yields, similarly to (4.2), that

$$\begin{aligned} \mathbb {E}\big [f_a'(X) - X f_a(X)\big ] = \mathbb {E}\Big [ f_a'(X) - \int G_x D_xf_a(X) \, \lambda (dx)\Big ]. \end{aligned}$$

Together with

$$\begin{aligned} D_xf_a(X) = f_a(X+D_xX) - f_a(X) = \int _0^{D_xX} f_a'(X+s) \, ds, \end{aligned}$$

we obtain

$$\begin{aligned} \mathbb {E}\big [f_a'(X) - X f_a(X)\big ]&= \mathbb {E}f_a'(X) \Big ( 1- \int G_x D_xX \, \lambda (dx) \Big ) \\&\qquad - \mathbb {E}\int \int _0^{D_xX} \big (f_a'(X+s) - f_a'(X)\big ) \, ds \ G_x \, \lambda (dx) \\&=: I_1-I_2, \end{aligned}$$

where the decomposition into \(I_1\) and \(I_2\) is allowed due to \(|f_a'|\le 1\) and (4.3). The commutation rule (3.11) yields

$$\begin{aligned} I_1 = \mathbb {E}f_a'(X) \Big ( 1 - \int G_x^2 \, \lambda (dx) - \int G_x \varvec{\delta }(D_xG) \, \lambda (dx) \Big ). \end{aligned}$$

From Fubini’s theorem, which is applicable because of \(|f_a'|\le 1\) and (4.3), and Lemma 3.1 it follows that

$$\begin{aligned} \mathbb {E}f_a'(X) \int G_x \varvec{\delta }(D_xG) \, \lambda (dx)&= \int \mathbb {E}f_a'(X) G_x \varvec{\delta }(D_xG) \, \lambda (dx) \\&= \int \int \mathbb {E}D_xG_y D_y\big (f_a'(X) G_x\big ) \, \lambda (dy) \, \lambda (dx). \end{aligned}$$

The use of Lemma 3.1 is justified by \(f_a'(X)G_x\in L^2(\mathbb {P}_\eta )\) for \(\lambda \)-a.e. x and \(D_y(f_a'(X)G_x)\in L^2(\mathbb {P}_\eta )\) for \(\lambda ^2\)-a.e. \((x,y)\), which are consequences of \(|f_a'|\le 1\), (2.2) and (2.3), as well as (2.3) and (2.4). From (3.3), we derive

$$\begin{aligned} D_y\big (f_a'(X) G_x\big ) = f_a'(X) D_yG_x + D_yf_a'(X) (G_x+D_yG_x). \end{aligned}$$

Combining this with \(|f_a'|\le 1\), (2.3) and (2.7), we see that

$$\begin{aligned}{} & {} \int \int \mathbb {E}\big |D_xG_y D_y\big (f_a'(X) G_x\big )\big | \, \lambda (dy) \, \lambda (dx) \nonumber \\{} & {} \le \int \mathbb {E}\big |f_a'(X) D_xG_y D_yG_x \big | \, \lambda ^2(d(x,y)) + \int \mathbb {E}\big | D_yf_a'(X) D_xG_y (G_x+D_yG_x) \big | \, \lambda ^2(d(x,y)) \nonumber \\{} & {} \le \mathbb {E}\int |D_xG_y D_yG_x| \, \lambda ^2(d(x,y)) + 2 \mathbb {E}\int (|D_xG_y G_x| + |D_xG_y D_yG_x|) \, \lambda ^2(d(x,y)) < \infty .\nonumber \\ \end{aligned}$$
(5.4)

By Fubini’s theorem, this makes it possible to rewrite \(I_1\) as

$$\begin{aligned} I_1&= \mathbb {E}f_a'(X) \big ( 1 - \int G_x^2 \, \lambda (dx) - \int D_yG_x D_xG_y \, \lambda ^2(d(x,y)) \big ) \\&\quad - \mathbb {E}\int D_yf_a'(X) (G_x+D_yG_x) D_xG_y \, \lambda ^2(d(x,y)) = : I_{1,1} - I_{1,2}. \end{aligned}$$

It follows, as in the proof for the Wasserstein distance, that

$$\begin{aligned} |I_{1,1}| \le T_1+T_2. \end{aligned}$$

As shown in (5.4), we can apply Fubini’s theorem to \(I_{1,2}\), so that

$$\begin{aligned} I_{1,2} = \int \mathbb {E}D_yf_a'(X) \int (G_x+D_yG_x) D_xG_y \, \lambda (dx) \, \lambda (dy). \end{aligned}$$

The boundedness of \(f_a'\) implies that \(|f_a'(X)| \le 1\) and \(|D_yf_a'(X)|\le 2\) for \(\lambda \)-a.e. y, while \(y\mapsto \int (G_x+D_yG_x) D_xG_y \, \lambda (dx)\) satisfies (2.2) and (2.3) because of \(T_6<\infty \). Thus, Lemma 3.1 shows that

$$\begin{aligned} I_{1,2} = \mathbb {E}f_a'(X) \varvec{\delta }_y\Big ( \int (G_x+D_yG_x) D_xG_y \, \lambda (dx)\Big ). \end{aligned}$$

Together with \(|f_a'|\le 1\) and Jensen’s inequality, we obtain that

$$\begin{aligned} |I_{1,2}|&\le \mathbb {E}\big |\varvec{\delta }_y\Big ( \int (G_x+D_yG_x) D_xG_y \, \lambda (dx)\Big )\big | \\&\le \bigg (\mathbb {E}\varvec{\delta }_y\Big ( \int (G_x+D_yG_x) D_xG_y \, \lambda (dx)\Big )^2 \bigg )^{1/2}. \end{aligned}$$

It follows from [13, Corollary 2.4] that

$$\begin{aligned} \mathbb {E}&\varvec{\delta }_y\Big ( \int (G_x+D_yG_x) D_xG_y \, \lambda (dx)\Big )^2 \\&\le \mathbb {E}\int \bigg ( \int (G_x+D_yG_x) D_xG_y \, \lambda (dx)\bigg )^2 \, \lambda (dy) \\&\qquad + \mathbb {E}\int \bigg ( \int D_z \big ((G_x+D_yG_x) D_xG_y\big ) \, \lambda (dx)\bigg )^2 \, \lambda ^2(d(y,z)) = T_6^2. \end{aligned}$$

In the sequel, we focus on \(I_2\). By (5.3), the inner integral in \(I_2\) equals

$$\begin{aligned} \int _0^{D_xX} \Big ((X+s) f_a(X+s) - X f_a(X) + {\textbf{1}}\{ X+s \le a\} - {\textbf{1}}\{ X \le a\} \Big ) \, ds. \end{aligned}$$

Since \(u \mapsto u f_a(u)\) is non-decreasing (see Lemma 5.2 (i)) and \(u\mapsto {\textbf{1}}\{u\le a\}\) is non-increasing, we derive by considering the cases \(D_xX\ge 0\) and \(D_xX<0\) separately that

$$\begin{aligned} \bigg | \int _0^{D_xX} \Big ((X+s) f_a(X+s) - X f_a(X)\Big ) \, ds \bigg |&\le D_xX \Big ((X+D_xX) f_a(X+D_xX) - X f_a(X)\Big ) \\&= D_xX D_x(Xf_a(X)) \end{aligned}$$

and

$$\begin{aligned} \bigg | \int _0^{D_xX} \Big ({\textbf{1}}\{ X+s \le a\} - {\textbf{1}}\{ X \le a\} \Big ) \, ds \bigg |&\le -D_xX \Big ({\textbf{1}}\{ X+D_xX \le a\} - {\textbf{1}}\{ X \le a\} \Big )\\&= -D_xX D_x{\textbf{1}}\{ X \le a\}. \end{aligned}$$

Combining these estimates with (3.11) leads to

$$\begin{aligned} |I_2|&\le \mathbb {E}\int D_xX D_x\big (Xf_a(X) - {\textbf{1}}\{X\le a\}\big ) |G_x| \, \lambda (dx) \\&= \mathbb {E}\int D_x(Xf_a(X) - {\textbf{1}}\{X\le a\}) G_x |G_x| \, \lambda (dx) \\&\quad + \mathbb {E}\int \varvec{\delta }(D_xG) D_x \big (Xf_a(X) - {\textbf{1}}\{X\le a\}\big ) |G_x| \, \lambda (dx) =: I_{2,1} + I_{2,2}. \end{aligned}$$

The decomposition into two integrals on the right-hand side is allowed as can be seen from the following argument. From Lemma 5.2 (ii), we know that

$$\begin{aligned} |uf_a(u)-{\textbf{1}}\{u\le a\}|\le 2 \quad \text {for all} \quad u\in \mathbb {R}. \end{aligned}$$
(5.5)

Together with (2.2), we see that

$$\begin{aligned} \mathbb {E}\int \big |D_x(Xf_a(X) - {\textbf{1}}\{X\le a\}) G_x |G_x|\big | \, \lambda (dx) \le 4 \mathbb {E}\int G_x^2 \, \lambda (dx) <\infty . \end{aligned}$$

It follows from (5.5), the Cauchy–Schwarz inequality, [13, Corollary 2.4] and (2.2)–(2.4) that

$$\begin{aligned}&\mathbb {E}\int \big |\varvec{\delta }(D_xG) D_x \big (Xf_a(X) - {\textbf{1}}\{X\le a\}\big ) |G_x|\big | \, \lambda (dx) \le 4 \mathbb {E}\int |\varvec{\delta }(D_xG) G_x| \, \lambda (dx) \\&\le 4 \bigg (\mathbb {E}\int \varvec{\delta }(D_xG)^2 \, \lambda (dx) \bigg )^{1/2} \bigg (\mathbb {E}\int G_x^2 \, \lambda (dx) \bigg )^{1/2} \\&\le 4 \bigg (\mathbb {E}\int (D_xG_y)^2 \, \lambda ^2(d(x,y)) + \mathbb {E}\int (D^2_{x,z}G_y)^2 \, \lambda ^3(d(x,y,z)) \bigg )^{1/2} \bigg ( \mathbb {E}\int G_x^2 \, \lambda (dx) \bigg )^{1/2}\\&<\infty . \end{aligned}$$

Thus, the integrals \(I_{2,1}\) and \(I_{2,2}\) are well defined and finite. Moreover, we can interchange expectation and integration in \(I_{2,1}\) and \(I_{2,2}\) by Fubini’s theorem.

We deduce from (5.5) for \(Z{:}{=}Xf_a(X) - {\textbf{1}}\{X\le a\}\) that

$$\begin{aligned} |Z| \le 2, \quad |D_xZ|\le 4 \quad \text {for} \quad \lambda \text {-a.e.}\ x \quad \text {and} \quad |D^2_{x,y}Z|\le 8 \quad \text {for} \quad \lambda ^2\text {-a.e.} \ (x,y).\nonumber \\ \end{aligned}$$
(5.6)

Note that \(\mathbb {E}\int G_x^4 \, \lambda (dx) <\infty \) since \(T_7<\infty \). Together with (2.8), we see that \(\mathbb {X}\ni x\mapsto G_x|G_x|\) satisfies the integrability conditions (2.2) and (2.3) and that \(G|G|\in {\text {dom}}\varvec{\delta }\). Thus, Lemma 3.1 with G replaced by \(G|G|\) implies

$$\begin{aligned} I_{2,1} = \mathbb {E}\big (Xf_a(X) - {\textbf{1}}\{X\le a\}\big ) \varvec{\delta }(G |G|). \end{aligned}$$

Since \( D_x\big (Xf_a(X) - {\textbf{1}}\{X\le a\}\big )|G_x|\in L^2(\mathbb {P}_\eta )\) for \(\lambda \)-a.e. x and \(D_y( D_x\big (Xf_a(X) - {{\textbf{1}}}\{X\le a\}\big )|G_x|)\in L^2(\mathbb {P}_\eta )\) for \(\lambda ^2\)-a.e. \((x,y)\), Lemma 3.1 and the product rule (3.3) yield

$$\begin{aligned} I_{2,2}&= \mathbb {E}\int D_xG_y D^2_{x,y}\big (Xf_a(X) - {\textbf{1}}\{X\le a\}\big ) (D_y|G_x|+|G_x|) \, \lambda ^2(d(x,y)) \\&\qquad + \mathbb {E}\int D_xG_y D_x\big (Xf_a(X) - {\textbf{1}}\{X\le a\}\big ) D_y|G_x| \, \lambda ^2(d(x,y)). \end{aligned}$$

The decomposition of \(I_{2,2}\) into two integrals is justified since it follows from (5.5), (2.3) and (2.7) that

$$\begin{aligned} \begin{aligned}&\mathbb {E}\int |D_xG_y D^2_{x,y}\big (Xf_a(X) - {\textbf{1}}\{X\le a\}\big ) (D_y|G_x|+|G_x|)| \, \lambda ^2(d(x,y)) \\&\le 8 \mathbb {E}\int |D_xG_y G_x| + (D_xG_y)^2 \, \lambda ^2(d(x,y)) < \infty \end{aligned} \end{aligned}$$
(5.7)

and

$$\begin{aligned} \begin{aligned}&\mathbb {E}\int |D_xG_y D_x\big (Xf_a(X) - {\textbf{1}}\{X\le a\}\big ) D_y|G_x|| \, \lambda ^2(d(x,y)) \\&\le 4 \mathbb {E}\int (D_xG_y)^2 \, \lambda ^2(d(x,y)) < \infty . \end{aligned} \end{aligned}$$
(5.8)

Note that \(h(x,y){:}{=}D_xG_y (D_y|G_x|+|G_x|)\) satisfies (5.1) because of \(T_9<\infty \), so that \(\varvec{\delta }_x(\varvec{\delta }_y(h(x,y)))\) is well defined by Lemma 5.1 (i). Together with (5.6) and (5.7), it follows from Lemma 5.1 (ii) that

$$\begin{aligned} \mathbb {E}\int D_xG_y D^2_{x,y}\big (Xf_a(X) - {\textbf{1}}\{X\le a\}\big ) (D_y|G_x|+|G_x|) \, \lambda ^2(d(x,y)) \\ = \mathbb {E}\big (Xf_a(X) - {\textbf{1}}\{X\le a\}\big ) \varvec{\delta }_x\big (\varvec{\delta }_y(D_xG_y (D_y|G_x|+|G_x|))\big ). \end{aligned}$$

Because of \(T_8<\infty \), we see that

$$\begin{aligned} \mathbb {E}\int \Big ( \int D_xG_y D_y|G_x| \, \lambda (dy) \Big )^2 \, \lambda (dx)<\infty \end{aligned}$$

and recall (2.9), whence \(\mathbb {X}\ni x\mapsto \int D_xG_y D_y|G_x| \, \lambda (dy)\) satisfies the integrability assumptions (2.2) and (2.3) and belongs to \({\text {dom}}\varvec{\delta }\). By (5.6), (5.8), Fubini’s theorem and Lemma 3.1,

$$\begin{aligned} \mathbb {E}\int&D_xG_y D_x\big (Xf_a(X) - {\textbf{1}}\{X\le a\}\big ) D_y|G_x| \, \lambda ^2(d(x,y)) \\&= \int \mathbb {E}D_x\big (Xf_a(X) - {\textbf{1}}\{X\le a\}\big ) \int D_xG_y D_y|G_x| \, \lambda (dy) \, \lambda (dx) \\&= \mathbb {E}\big (Xf_a(X) - {\textbf{1}}\{X\le a\}\big ) \varvec{\delta }_x\Big ( \int D_xG_y D_y|G_x| \, \lambda (dy) \Big ). \end{aligned}$$

We have shown that

$$\begin{aligned} I_{2,2}&= \mathbb {E}\big (Xf_a(X) - {\textbf{1}}\{X\le a\}\big ) \varvec{\delta }_x\big (\varvec{\delta }_y(D_xG_y (D_y|G_x|+|G_x|))\big ) \\&\qquad + \mathbb {E}\big (Xf_a(X) - {\textbf{1}}\{X\le a\}\big ) \varvec{\delta }_x\Big ( \int D_xG_y D_y|G_x| \, \lambda (dy) \Big ). \end{aligned}$$

Now (5.5) and Jensen’s inequality yield that

$$\begin{aligned} |I_{2,1}| \le 2 \mathbb {E}|\varvec{\delta }(G |G|)| \le 2 \sqrt{ \mathbb {E}\varvec{\delta }(G |G|)^2 } \end{aligned}$$

and that

$$\begin{aligned} |I_{2,2}|&\le 2 \bigg (\mathbb {E}\varvec{\delta }_x\big (\varvec{\delta }_y(D_xG_y (D_y|G_x|+|G_x|))\big )^2 \bigg )^{1/2} \\&\qquad + 2\bigg (\mathbb {E}\varvec{\delta }_x\Big ( \int D_xG_y D_y|G_x| \, \lambda (dy) \Big )^2 \bigg )^{1/2}. \end{aligned}$$

By (3.6), we have

$$\begin{aligned} \mathbb {E}\varvec{\delta }(G |G|)^2 = \mathbb {E}\int G_x^4 \, \lambda (dx) + \mathbb {E}\int D_x(G_y|G_y|) D_y(G_x|G_x|) \, \lambda ^2(d(x,y)) = T_7^2 \end{aligned}$$

and

$$\begin{aligned} \mathbb {E}&\varvec{\delta }_x\Big ( \int D_xG_y D_y|G_x| \,\lambda (dy) \Big )^2 \\&\le \mathbb {E}\int \bigg ( \int D_xG_y D_y|G_x| \, \lambda (dy) \bigg )^2 \, \lambda (dx) \\&\quad + \mathbb {E}\int D_x\bigg ( \int D_zG_y D_y|G_z| \, \lambda (dy) \bigg ) D_z\bigg ( \int D_xG_y D_y|G_x| \, \lambda (dy) \, \bigg ) \, \lambda ^2(d(x,z)) \\&= T_8^2. \end{aligned}$$

From Lemma 5.1 (i), whose assumptions are satisfied due to \(T_9<\infty \), it follows that

$$\begin{aligned}&\mathbb {E}\varvec{\delta }_x\big (\varvec{\delta }_y(D_xG_y (D_y|G_x|+|G_x|))\big )^2 \\&\quad \le 3\mathbb {E}\int (D_xG_y)^2 (D_y|G_x|+|G_x|)^2\, \lambda ^2(d(x,y)) \\&\qquad + 3 \mathbb {E}\int \big ( D_z\big ( D_xG_y (D_y|G_x|+|G_x|) \big ) \big )^2 \, \lambda ^3(d(x,y,z)) \\&\qquad +2 \mathbb {E}\int \big ( D^2_{z,w}\big ( D_xG_y (D_y|G_x|+|G_x|) \big ) \big )^2 \,\lambda ^4(d(x,y,z,w)) = T_9^2, \end{aligned}$$

which completes the proof. \(\square \)

6 Poisson Embedding

In this section, we consider a Poisson process \(\eta \) on \(\mathbb {X}{:}{=}\mathbb {R}^d\times \mathbb {R}_+\), whose intensity measure \(\lambda \) is the product of the Lebesgue measure \(\lambda _d\) on \(\mathbb {R}^d\) and the Lebesgue measure \(\lambda _+\) on \(\mathbb {R}_+\). We fix a measurable mapping \(\varphi :\mathbb {R}^d\times \textbf{N}\rightarrow [0,\infty ]\), where the value \(\infty \) is allowed for technical convenience. Then

$$\begin{aligned} \xi {:}{=}\int {{\textbf{1}}}\{s\in \cdot \} {{\textbf{1}}}\{x\le \varphi (s,\eta -\delta _{(s,x)})\}\,\eta (d(s,x)) \end{aligned}$$
(6.1)

is a point process on \(\mathbb {R}^d\). (At this stage, it might not be locally finite.) Let \(u:\mathbb {R}^d\rightarrow \mathbb {R}\) be a measurable function, and define \(G:\textbf{N}\times \mathbb {X}\rightarrow \mathbb {R}\) by

$$\begin{aligned} G_{(s,x)}(\mu ){:}{=}u(s){{\textbf{1}}}\{x\le \varphi (s,\mu )\}, \quad (\mu ,(s,x))\in \textbf{N}\times \mathbb {X}. \end{aligned}$$

Under suitable integrability assumptions, we then have

$$\begin{aligned} \varvec{\delta }(G)=\int u(s){{\textbf{1}}}\{x\le \varphi (s,\eta -\delta _{(s,x)})\}\,\eta (d(s,x)) -\int u(s){{\textbf{1}}}\{x\le \varphi (s,\eta )\}\,\lambda (d(s,x)), \end{aligned}$$

that is,

$$\begin{aligned} \varvec{\delta }(G)=\int u(s)\,\xi (ds)-\int u(s)\varphi (s,\eta )\,ds. \end{aligned}$$

This can be interpreted as the integral of u with respect to the compensated point process \(\xi \). To make the dependence on u more visible, we abuse our notation and write \(\varvec{\delta }(u){:}{=}\varvec{\delta }(G)\) whenever this integral is defined pathwise.
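To make the pathwise formula concrete, here is a minimal simulation sketch; the rate \(\varphi \), the window, the integrand and all numerical choices are illustrative assumptions and not part of the formal development.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy rate: phi(s, mu) = 1 + number of points of mu in [s-1, s) x [0, 2].
# It only looks strictly before s, so it is predictable in the sense of
# Remark 6.1, and it is bounded by 1 + mu([s-1, s) x [0, 2]).
def phi(s, points):
    return 1.0 + np.sum((points[:, 0] >= s - 1.0) & (points[:, 0] < s)
                        & (points[:, 1] <= 2.0))

T, M = 50.0, 40.0        # window [0, T]; heights truncated at M, harmless
n = rng.poisson(T * M)   # as long as phi stays below M
eta = np.column_stack([rng.uniform(0, T, n), rng.uniform(0, M, n)])

def u(s):
    return np.cos(s)     # a bounded integrand on R

# First term: sum of u(s) over points (s, x) of eta with
# x <= phi(s, eta - delta_{(s, x)}).
term1 = sum(u(s) for i, (s, x) in enumerate(eta)
            if x <= phi(s, np.delete(eta, i, axis=0)))

# Second term: int_0^T u(s) phi(s, eta) ds, by a midpoint Riemann sum.
ds = T / 4000
mids = (np.arange(4000) + 0.5) * ds
term2 = ds * sum(u(s) * phi(s, eta) for s in mids)

print("pathwise KS-integral delta(u) ~", term1 - term2)
```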

Under certain assumptions, the standardised \(\varvec{\delta }(u)\) can be expected to be close to a normal distribution. To establish an asymptotic scenario, we take a Borel set \(B\subset \mathbb {R}^d\) with \(\lambda _d(B)<\infty \) and define the function \(u_B:\mathbb {R}^d\rightarrow \mathbb {R}\) by \(u_B(s){:}{=}{{\textbf{1}}}\{s\in B\}u(s)\). Then \(\varvec{\delta }(u_B)\) is the KS-integral of the function \(G_B:\textbf{N}\times \mathbb {X}\rightarrow \mathbb {R}\), defined by \(G_B(\mu ,s,x){:}{=}u_B(s){{\textbf{1}}}\{x\le \varphi (s,\mu )\}\). We are interested in the normal approximation of \(\varvec{\delta }(u_B)\) for B of growing volume.

Remark 6.1

Assume that \(d=1\) and that \(\varphi \) is predictable, that is, \(\varphi (t,\mu )=\varphi (t,\mu _{t-})\), where \(\mu _{t-}\) is the restriction of \(\mu \in \textbf{N}\) to \((-\infty ,t)\times \mathbb {R}_+\). Then, under suitable integrability assumptions (satisfied under our assumptions below), \(\big (\xi ([0,t])-\int ^t_0 \varphi (s,\eta )\,ds\big )_{t\ge 0}\) is a martingale with respect to the filtration \((\sigma (\eta _{(-\infty ,t]\times \mathbb {R}_+}))_{t\ge 0}\); see, for example, [11]. Therefore, \((\varphi (t,\cdot ))_{t\ge 0}\) is a stochastic intensity of \(\xi \) (on \(\mathbb {R}_+\)) with respect to this filtration. Take \(B=[0,T]\) for some \(T>0\) and write \(u_T{:}{=}u_B\). Then \((\varvec{\delta }(u_T))_{T\ge 0}\) is a martingale. Theorem 3.1 from [25] provides a quantitative central limit theorem in the Wasserstein distance for \(\varvec{\delta }(u_T)\). Below we derive a similar result using our tools, not only for the Wasserstein but also for the Kolmogorov distance. It should be noted that predictability and martingale properties are of no relevance for our approach. All that matters is that \(\varvec{\delta }(u_B)\) is a KS-integral with respect to the Poisson process \(\eta \).

Before stating some assumptions on \(\varphi \), we introduce some useful terminology. A mapping Z from \(\textbf{N}\) to the Borel sets of \(\mathbb {X}\) is called graph-measurable if \((\mu ,s,x)\mapsto {{\textbf{1}}}\{(s,x)\in Z(\mu )\}\) is a measurable mapping. Given such a mapping, we define a whole family \(Z_t\), \(t\in \mathbb {R}^d\), of such mappings by setting

$$\begin{aligned} Z_t(\mu ){:}{=}Z(\theta _t\mu )+t, \end{aligned}$$

where \(\theta _t\mu {:}{=}\int {{\textbf{1}}}\{(r-t,z)\in \cdot \}\,\mu (d(r,z))\) is the shift of \(\mu \) by t in the first coordinate and \(A+t{:}{=}\{(s+t,x):(s,x)\in A\}\) for any \(A\subset \mathbb {R}^d\times \mathbb {R}_+\).

We assume that there exists a graph-measurable Z such that

$$\begin{aligned} \varphi (t,\mu +\mu ')=\varphi (t,(\mu +\mu ')_{Z_t(\mu )}), \quad (t,\mu ,\mu ')\in \mathbb {R}^d\times \textbf{N}\times \textbf{N},\, \mu '(\mathbb {X})\le 3. \end{aligned}$$
(6.2)

Here, we denote by \(\nu _A\) the restriction of a measure \(\nu \) to a Borel set A of \(\mathbb {X}\). Next, we assume that there exists a measurable mapping \(Y:\textbf{N}\rightarrow \mathbb {R}_+\) such that

$$\begin{aligned} \varphi (t,\mu +\mu ')\le Y(\theta _t\mu ), \quad (t,\mu ,\mu ')\in \mathbb {R}^d\times \textbf{N}\times \textbf{N},\, \mu '(\mathbb {X})\le 3. \end{aligned}$$
(6.3)

We let \(Y_t(\eta )=Y(\theta _t\eta )\) for \(t\in \mathbb {R}^d\). As in the rest of the paper, we write \(Z_t\), \(Y_t\) and \(\varphi _t\) instead of \(Z_t(\eta )\), \(Y_t(\eta )\) and \(\varphi _t(\eta )\) for \(t\in \mathbb {R}^d\). Finally, we need the following integrability assumptions:

$$\begin{aligned}&\int _{\mathbb {R}^d} \big (\mathbb {E}\lambda (Z_0\cap Z_s)^4\big )^{1/4}\,ds<\infty , \end{aligned}$$
(6.4)
$$\begin{aligned}&\int _{\mathbb {R}_+}\int _{\mathbb {R}^d} \mathbb {P}((s,x)\in Z_0)^{1/4}\,ds\,dx<\infty , \end{aligned}$$
(6.5)
$$\begin{aligned}&\int _{\mathbb {R}_+}\int _{\mathbb {R}_+}\int _{\mathbb {R}^d} \mathbb {P}((s,x)\in Z_0,(0,y)\in Z_s)^{1/3}\,ds\,dx\,dy<\infty , \end{aligned}$$
(6.6)
$$\begin{aligned}&\mathbb {E}Y_0^4<\infty . \end{aligned}$$
(6.7)

It follows from Fubini’s theorem, Hölder’s inequality and (6.5) that \(\mathbb {E}\lambda (Z_0)^4<\infty \).
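Indeed, a short sketch of this standard argument: expanding the fourth power, Fubini's theorem and the generalised Hölder inequality applied to the four indicator functions give

$$\begin{aligned} \mathbb {E}\lambda (Z_0)^4&=\int \mathbb {P}\big ((s_1,x_1)\in Z_0,\ldots ,(s_4,x_4)\in Z_0\big )\,d(s_1,x_1,\ldots ,s_4,x_4)\\&\le \bigg (\int _{\mathbb {R}_+}\int _{\mathbb {R}^d} \mathbb {P}((s,x)\in Z_0)^{1/4}\,ds\,dx\bigg )^4, \end{aligned}$$

and the right-hand side is finite by (6.5).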

Assumptions (6.3) and (6.7) ensure that \(\varvec{\delta }(u_B)\) is defined pathwise if u is bounded. Moreover, we will see below that our assumptions imply that (2.2) and (2.3) hold. Therefore, \(G_B\) is in the domain of the KS-integral.

Next we illustrate (6.2) and (6.4)–(6.6) with a simple example. Further examples will be discussed later in the section.

Example 6.2

Assume that \(d=1\). A simple (deterministic) choice of the sets \(Z_t\) is \(Z_t{:}{=}[t-h,t)\times C\), where \(h>0\) and \(C\subset \mathbb {R}_+\) is a bounded Borel set. If we assume that \(\varphi (t,\mu )=\varphi (t,\mu _{Z_t})\) for all \((t,\mu )\), then (6.2) holds, while (6.4)–(6.6) are trivially true. To discuss another, less trivial, choice, we fix another Borel set \(C'\subset \mathbb {R}_+\) with \(0<\lambda _+(C')<\infty \) and \(n\in \mathbb {N}\). For \(\mu \in \textbf{N}\) and \(t\in \mathbb {R}\), let \(T^t_n(\mu )\) denote the nth point of \(\mu (\cdot \times C')\) strictly before \(t\in \mathbb {R}\). Define \(Z_t(\mu ){:}{=}[T^t_n(\mu ),t)\times C\). Then \(Z_t(\mu )=Z(\theta _t\mu )+t\), and we have

$$\begin{aligned} Z_t(\mu +\mu ')=Z_t((\mu +\mu ')_{Z_t(\mu )})\;\text {and}\; Z_t(\mu +\mu ')\subset Z_t(\mu ), \quad (t,\mu ,\mu ')\in \mathbb {R}\times \textbf{N}\times \textbf{N}. \end{aligned}$$
(6.8)

Assuming again that \(\varphi (t,\mu )=\varphi (t,\mu _{Z_t})\), we easily obtain (6.2). It is straightforward to check that (6.4)–(6.6) hold.
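For the second construction, the following sketch shows how \(Z_t\) can be computed from a finite sample; the concrete sets C and \(C'\) and all names are illustrative choices.

```python
import numpy as np

def Z_t(mu, t, n, C=(0.0, 1.0), C_prime=(1.0, 2.0)):
    """Return the set Z_t(mu) = [T_n^t(mu), t) x C of Example 6.2, where
    T_n^t(mu) is the n-th point of mu(. x C') strictly before t; mu is an
    array of points (first coordinate, height)."""
    firsts = mu[(mu[:, 1] >= C_prime[0]) & (mu[:, 1] <= C_prime[1]), 0]
    before = np.sort(firsts[firsts < t])
    if len(before) < n:        # a finite sample may miss the n-th such point
        return None
    return (before[-n], t), C  # the interval [T_n^t, t) and the height set C
```

The inclusion \(Z_t(\mu +\mu ')\subset Z_t(\mu )\) in (6.8) is then simply the observation that adding points can only move \(T^t_n\) closer to t.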

For the normal approximation of \(\varvec{\delta }(u_B)\), we have the following result.

Theorem 6.3

Let \(\varphi :\mathbb {R}^d\times \textbf{N}\rightarrow [0,\infty ]\) be measurable, and let Z be a graph-measurable mapping from \(\textbf{N}\) to the Borel sets of \(\mathbb {X}\). Assume that (6.2)–(6.7) are satisfied. Let \(u:\mathbb {R}^d\rightarrow \mathbb {R}\) be measurable and bounded, and let \(B\subset \mathbb {R}^d\) be a Borel set with \(\lambda _d(B)<\infty \). Finally, assume that \(\sigma ^2_B{:}{=}{{\,\mathrm{{\mathbb Var}}\,}}(\varvec{\delta }(u_B))>0\). Then there exists a constant \(c>0\), not depending on B, such that

$$\begin{aligned} \max \big \{d_W\big (\sigma _B^{-1}\varvec{\delta }(u_B),N\big ), d_K\big (\sigma _B^{-1}\varvec{\delta }(u_B),N\big )\big \} \le c \lambda _d(B)^{1/2}\sigma ^ {-2}_B+c \lambda _d(B)\sigma ^ {-3}_B. \end{aligned}$$
(6.9)

Proof

We apply Theorem 2.1 with \(G_B/\sigma _B\) in place of G. For notational simplicity, we omit the subscript B of \(G_B\). We need to bound the terms \(T_i\) for \(i\in \{1,\ldots ,9\}\). The assumptions of Theorem 2.1 are checked at the end of the proof. For simplicity, assume that |u| is bounded by 1. The value of a constant c might change from line to line. We often write \(D_{s,x}\) instead of \(D_{(s,x)}\).

The term \(T_3'{:}{=} \sigma ^{3}_B T_3\) satisfies

$$\begin{aligned} T_3'\le \mathbb {E}\int _B \varphi _s\,ds \le c \lambda _d(B), \end{aligned}$$

where the second inequality follows from assumptions (6.3) and (6.7). Here and later, we often use that \(\theta _s\eta \) and \(\eta \) have the same distribution for each \(s\in \mathbb {R}^d\), whence \(Y_s\) has the same distribution for all \(s\in \mathbb {R}^d\) and the same holds for \(\lambda (Z_s)\).

We deduce from (6.2) that, for \((s,x)\in \mathbb {X}\), \((t,y)\notin Z_s\) and \(\nu \in {\textbf{N}}\) with \(\nu (\mathbb {X})\le 2\),

$$\begin{aligned} {{\textbf{1}}}\{ x\le \varphi _s(\eta +\nu +\delta _{(t,y)}) \}&= {{\textbf{1}}}\{ x\le \varphi _s((\eta +\nu +\delta _{(t,y)})_{Z_s}) \} \\&= {{\textbf{1}}}\{ x\le \varphi _s((\eta +\nu )_{Z_s}) \} = {{\textbf{1}}}\{ x\le \varphi _s(\eta +\nu ) \}, \end{aligned}$$

whence the first three difference operators of \({{\textbf{1}}}\{ x\le \varphi _s \}\) vanish if one of the additional points is outside of \(Z_s\). From (6.3), we see that \({{\textbf{1}}}\{ x\le \varphi _s \}\) and its first three difference operators become zero if \(x> Y_s\). In the following, these observations are frequently used to bound difference operators in terms of indicator functions.

First we consider \(T_1\). Writing the square of the inner integral as a double integral, we have

$$\begin{aligned} T'_1{:}{=}\sigma ^4_BT^2_1\le \mathbb {E}\int {{\textbf{1}}}\{s,r\in B\} |D_{t,y}{{\textbf{1}}}\{x\le \varphi _s\}||D_{t,y}{{\textbf{1}}}\{z\le \varphi _r\}|\,d(s,x,t,y,r,z). \end{aligned}$$

By the discussed behaviour of the difference operators,

$$\begin{aligned} T'_1&\le c\,\mathbb {E}\int {{\textbf{1}}}\{s,r\in B\} {{\textbf{1}}}\{(t,y)\in Z_s\cap Z_r\}{{\textbf{1}}}\{x\le Y_s,z\le Y_r\}\,d(s,x,t,y,r,z)\\&= c\,\mathbb {E}\int _{B^2} \lambda (Z_s\cap Z_r)Y_sY_r\,d(s,r)\\&\le c\,\int _{B^2} \big (\mathbb {E}\lambda (Z_s\cap Z_r)^3\big )^{1/3}\big (\mathbb {E}Y^3_s\big )^{1/3} \big (\mathbb {E}Y_r^3\big )^{1/3}\,d(s,r), \end{aligned}$$

where we have used Hölder’s inequality. By (6.7), \(\mathbb {E}Y_s^3=\mathbb {E}Y_r^3=\mathbb {E}Y_0^3<\infty \). Moreover,

$$\begin{aligned} \mathbb {E}\lambda (Z_s\cap Z_r)^3&=\mathbb {E}\lambda ((Z(\theta _s\eta )+s)\cap (Z(\theta _r\eta )+r))^3\\&=\mathbb {E}\lambda ((Z(\theta _{s-r}\eta )+s-r)\cap Z(\eta ))^3. \end{aligned}$$

Therefore,

$$\begin{aligned} T'_1&\le c\,\int {{\textbf{1}}}\{s\in \mathbb {R}^d,r\in B\} \big (\mathbb {E}\lambda ((Z(\theta _{s}\eta )+s)\cap Z(\eta ))^3\big )^{1/3}\,d(s,r)\\&=c\lambda _d(B) \int _{\mathbb {R}^d}\big (\mathbb {E}\lambda (Z_s\cap Z_0)^3\big )^{1/3}\,ds \le c\lambda _d(B), \end{aligned}$$

where we have used assumption (6.4) (and the monotonicity of \(L_p\)-norms). Hence, \(T_1\le c \lambda _d(B)^{1/2}\sigma ^ {-2}_B\), as required by (6.9).

For the term \(T_2\), we have

$$\begin{aligned} T'_2{:}{=}\sigma ^4_BT^2_2\le \mathbb {E}\int \bigg (\int {{\textbf{1}}}\{s,t\in B\}|D_{r,z}(D_{s,x}{{\textbf{1}}}\{y\le \varphi _t\} D_{t,y}{{\textbf{1}}}\{x\le \varphi _s\})| \,d(s,x,t,y)\bigg )^2d(r,z). \end{aligned}$$

The inner integrand contributes only if \((s,x)\in Z_t\), \((t,y)\in Z_s\), and \((r,z)\in Z_t\) or \((r,z)\in Z_s\). Since the last two cases are symmetric, \(T'_2\) can be bounded by

$$\begin{aligned} c\,\mathbb {E}\int \bigg (\int {{\textbf{1}}}\{t\in B\} {{\textbf{1}}}\{(r,z)\in Z_t,(s,x)\in Z_t,(t,y)\in Z_s\} \,d(s,x,t,y)\bigg )^2d(r,z). \end{aligned}$$

By Fubini’s theorem,

$$\begin{aligned} T'_{2}&\le c\,\mathbb {E}\int {{\textbf{1}}}\{t,t'\in B\} \lambda (Z_t\cap Z_{t'}) {{\textbf{1}}}\{(s,x)\in Z_t,(t,y)\in Z_s,(s',x')\in Z_{t'},(t',y')\in Z_{s'}\} \\&\qquad \qquad \qquad \times \,d(s,x,t,y,s',x',t',y')\\&\le c\int {{\textbf{1}}}\{t,t'\in B\} \big (\mathbb {E}\lambda (Z_t\cap Z_{t'})^3\big )^{1/3} \,\mathbb {P}((s,x)\in Z_t,(t,y)\in Z_s)^{1/3}\\&\qquad \qquad \qquad \times \mathbb {P}((s',x')\in Z_{t'},(t',y')\in Z_{s'})^{1/3} \,d(s,x,t,y,s',x',t',y'). \end{aligned}$$

By definition of \(Z_t\) and \(Z_s\) and the distributional invariance of \(\eta \),

$$\begin{aligned} \mathbb {P}((s,x)\in Z_t,(t,y)\in Z_s)&=\mathbb {P}((s-t,x)\in Z(\theta _t\eta ),(t-s,y)\in Z(\theta _s\eta ))\\&=\mathbb {P}((s-t,x)\in Z(\eta ),(t-s,y)\in Z(\theta _{s-t}\eta )). \end{aligned}$$

Changing variables yields that

$$\begin{aligned} T'_{2}&\le cb^2\,\int _{B^2} (\mathbb {E}\lambda (Z_t\cap Z_{t'})^3)^{1/3} \,d(t,t'), \end{aligned}$$

where

$$\begin{aligned} b{:}{=}\int \mathbb {P}((s,x)\in Z(\eta ),(-s,y)\in Z(\theta _{s}\eta ))^{1/3}\,d(s,x,y). \end{aligned}$$

Since

$$\begin{aligned} \mathbb {P}((s,x)\in Z(\eta ),(-s,y)\in Z(\theta _{s}\eta )) =\mathbb {P}((s,x)\in Z_0,(0,y)\in Z_s), \end{aligned}$$

we obtain from assumption (6.6) that \(b<\infty \). Hence,

$$\begin{aligned} T'_{2}&\le c\,\int _{B^2} \big (\mathbb {E}\lambda (Z_t\cap Z_{t'})^{3}\big )^{1/3} \,d(t,t') =c \int _{B^2} \big (\mathbb {E}\lambda (Z_{t-t'}\cap Z_{0})^3\big )^{1/3} \,d(t,t')\le c\lambda _d(B), \end{aligned}$$

where we have used assumption (6.4).

Each of the summands in the term \(T'_4{:}{=}\sigma ^3_BT_4\) includes the factor \(D_{s,x}{{\textbf{1}}}\{ y\le \varphi _t \}\), so that

$$\begin{aligned} T_4'&\le c\, \mathbb {E}\int {{\textbf{1}}}\{t\in B\} {{\textbf{1}}}\{(s,x)\in Z_t\} {{\textbf{1}}}\{y \le Y_t\} \, d(s,x,t,y) = c \mathbb {E}\int _{B} \lambda (Z_t) Y_t \, dt \\&\le c \int _{B} \big (\mathbb {E}\lambda (Z_t)^2 \big )^{1/2} \big (\mathbb {E}Y_t^2 \big )^{1/2} \, dt = c \big (\mathbb {E}\lambda (Z_0)^2\big )^{1/2} \big (\mathbb {E}Y_0^2 \big )^{1/2} \lambda _d(B). \end{aligned}$$

For \(T'_5{:}{=}\sigma ^3_BT_5\), we have

$$\begin{aligned} T_5' \le c\, \mathbb {E}\int {{\textbf{1}}}\{r\in B\} {{\textbf{1}}}\{(s,x) \in Z_t\} {{\textbf{1}}}\{ (t,y) \in Z_r \} {{\textbf{1}}}\{z\le Y_r\} \, d(s,x,t,y,r,z), \end{aligned}$$

where in the second term of \(T_5\) we renamed x as y and vice versa. This leads to the upper bound

$$\begin{aligned} T_5'&\le c\, \mathbb {E}\int {{\textbf{1}}}\{r\in B\} {{\textbf{1}}}\{ (t,y) \in Z_r \} \lambda (Z_t) Y_r \, d(t,y,r) \\&\le c \int {{\textbf{1}}}\{r\in B\} \mathbb {P}((t,y) \in Z_r)^{1/3} \big (\mathbb {E}\lambda (Z_t)^3\big )^{1/3} \big (\mathbb {E}Y_r^3\big )^{1/3} \, d(t,y,r) \\&= c \big (\mathbb {E}\lambda (Z_0)^3\big )^{1/3} \big (\mathbb {E}Y_0^3\big )^{1/3} \int \mathbb {P}((t,y) \in Z_0)^{1/3} \, d(t,y) \lambda _d(B). \end{aligned}$$

We can rewrite \(T'_6{:}{=}\sigma ^4_B T_6^2\) as a sum of \(T_{6,1}'\) and \(T_{6,2}'\) with

$$\begin{aligned} T_{6,1}'&\le c\, \mathbb {E}\int \bigg ( \int {{\textbf{1}}}\{t\in B\} {{\textbf{1}}}\{ y\le Y_t \} {{\textbf{1}}}\{(s,x)\in Z_t\} \, d(s,x) \bigg )^2 \, d(t,y) \\&\le c\, \mathbb {E}\int {{\textbf{1}}}\{t\in B\} {{\textbf{1}}}\{ y\le Y_t \} \lambda (Z_t)^2 \, d(t,y) = c \mathbb {E}\int _B Y_t \lambda (Z_t)^2 \, dt \\&\le c \int _B \big (\mathbb {E}Y_t^3\big )^{1/3} \big (\mathbb {E}\lambda (Z_t)^3\big )^{2/3} \, dt = c \big (\mathbb {E}Y_0^3\big )^{1/3} \big (\mathbb {E}\lambda (Z_0)^3\big )^{2/3} \lambda _d(B) \end{aligned}$$

and

$$\begin{aligned}&T_{6,2}' \le c\, \mathbb {E}\int \bigg ( \int {{\textbf{1}}}\{t\in B\} {{\textbf{1}}}\{ y\le Y_t \} {{\textbf{1}}}\{(s,x)\in Z_t\} {{\textbf{1}}}\{ (r,z)\in Z_s\cup Z_t \} \, d(s,x) \bigg )^2 \, d(t,y,r,z) \\&\quad = c\, \mathbb {E}\int {{\textbf{1}}}\{t\in B\} {{\textbf{1}}}\{ y\le Y_t \} {{\textbf{1}}}\{(s,x)\in Z_t\} {{\textbf{1}}}\{ (r,z)\in Z_s\cup Z_t \} \\&\qquad \qquad \quad \times {{\textbf{1}}}\{(s',x')\in Z_t\} {{\textbf{1}}}\{ (r,z)\in Z_{s'}\cup Z_t \} \, d(s,x,s',x',t,y,r,z) \\&\quad = c\, \mathbb {E}\int {{\textbf{1}}}\{t\in B\} Y_t {{\textbf{1}}}\{(s,x),(s',x')\in Z_t\} \lambda (( Z_s\cup Z_t)\cap (Z_{s'}\cup Z_t)) \, d(s,x,s',x',t) \\&\quad \le c \int {{\textbf{1}}}\{t\in B\} \mathbb {P}((s,x)\in Z_t)^{1/4} \mathbb {P}((s',x')\in Z_t)^{1/4} \big (\mathbb {E}Y_t^4\big )^{1/4} \\&\qquad \qquad \quad \times \big ( \big (\mathbb {E}\lambda (Z_s)^4\big )^{1/4}+\big (\mathbb {E}\lambda (Z_t)^4\big )^{1/4}\big ) \, d(s,x,s',x',t) \\&\quad = 2c \big (\mathbb {E}Y_0^4\big )^{1/4} \big (\mathbb {E}\lambda (Z_0)^4\big )^{1/4} \bigg ( \int \mathbb {P}((s,x)\in Z_0)^{1/4} \, d(s,x) \bigg )^2 \lambda _d(B). \end{aligned}$$

For \(T'_7{:}{=}\sigma ^4_B T_7^2\), the first term can be bounded as \(T_3'\), while the second term is bounded by

$$\begin{aligned} c\, \mathbb {E}\int {{\textbf{1}}}\{t\in B\} {{\textbf{1}}}\{ (s,x)\in Z_t\} {{\textbf{1}}}\{y\le Y_t\} \, d(s,x,t,y), \end{aligned}$$
(6.10)

which we treated above in order to control \(T_4'\).

We can decompose \(T_8'{:}{=}\sigma _B^4 T_8^2\) into two terms \(T_{8,1}'\) and \(T_{8,2}'\), where \(T_{8,1}'\) can be bounded as \(T_{6,1}'\). Since the product of two difference operators in \(T_{8,2}'\) is bounded by the sum of the squared difference operators, \(T_{8,2}'\) can be controlled as \(T_{6,2}'\).

Note that \(T_9'{:}{=} \sigma _B^4 T_9^2\) can be written as a sum of three terms \(T_{9,1}',T_{9,2}',T_{9,3}'\), where \(T_{9,i}'\) is an integral with respect to i points for \(i\in \{1,2,3\}\). The term \(T_{9,1}'\) can be bounded by (6.10), while

$$\begin{aligned} T_{9,2}'&\le c\, \mathbb {E}\int {{\textbf{1}}}\{t\in B\} {{\textbf{1}}}\{ y \le Y_t \} {{\textbf{1}}}\{(s,x)\in Z_t\} {{\textbf{1}}}\{ (r,z) \in Z_s \cup Z_t \} \, d(s,x,t,y,r,z) \\&\le c\, \mathbb {E}\int {{\textbf{1}}}\{t\in B\} Y_t {{\textbf{1}}}\{(s,x)\in Z_t\} (\lambda (Z_s) + \lambda (Z_t)) \, d(s,x,t) \\&\le c \int {{\textbf{1}}}\{t\in B\} \big (\mathbb {E}Y_t^3\big )^{1/3} \mathbb {P}((s,x)\in Z_t)^{1/3} \big (\big (\mathbb {E}\lambda (Z_s)^3\big )^{1/3} + \big (\mathbb {E}\lambda (Z_t)^3\big )^{1/3}\big ) \, d(s,x,t) \\&\le 2c \big (\mathbb {E}Y_0^3 \big )^{1/3} \big (\mathbb {E}\lambda (Z_0)^3\big )^{1/3} \int \mathbb {P}((s,x)\in Z_0)^{1/3} \, d(s,x) \lambda _d(B). \end{aligned}$$

For \(T_{9,3}'\), we deduce the bound

$$\begin{aligned} T_{9,3}'&\le c\, \mathbb {E}\int {{\textbf{1}}}\{t\in B\}{{\textbf{1}}}\{y\le Y_t\} {{\textbf{1}}}\{(s,x)\in Z_t\} {{\textbf{1}}}\{(s',x'),(r,z)\in Z_s\cup Z_t\}\, \\&\quad \times d(s,x,t,y,r,z,s',x') \\&\le c\, \mathbb {E}\int {{\textbf{1}}}\{t\in B\} Y_t {{\textbf{1}}}\{(s,x)\in Z_t\}(\lambda (Z_s)+\lambda (Z_t))^2 \, d(s,x,t), \end{aligned}$$

which can be treated similarly to the computation for \(T_{9,2}'\), but with the power 4.

Finally, we check the assumptions of Theorem 2.1. The expression in (2.2) can be treated as \(T_3'\), while (2.3), (2.7) and (2.8) can be bounded as \(T_4'\). Similarly, we can verify (2.4), (2.5) and (2.9) by using the computations for \(T_{9,2}'\), \(T_{9,3}'\) and \(T_{6,2}'\), respectively. \(\square \)

Remark 6.4

Theorem 6.3 can be used to establish central limit theorems. Consider, for instance, the setting of Remark 6.1. Two possible choices of \(Z_t\) are provided in Example 6.2. Since \(\varphi \) is assumed to be predictable in Remark 6.1, the cyclic condition (2.12) is satisfied and (2.6) simplifies to

$$\begin{aligned} \sigma _T^2{:}{=}{{\,\mathrm{{\mathbb Var}}\,}}(\varvec{\delta }(u_T))=\int ^T_0 u(t)^2\, \mathbb {E}\varphi (t,\eta )\,dt. \end{aligned}$$

It is natural to assume that \(\sigma _T^2\ge c T\) for some \(c>0\) and all sufficiently large T. If, additionally, the assumptions of Theorem 6.3 are satisfied, then (6.9) shows that

$$\begin{aligned} \max \big \{d_W\big (\sigma _T^{-1}\varvec{\delta }(u_T),N\big ),d_K\big (\sigma _T^{-1}\varvec{\delta }(u_T),N\big )\big \} \le c' T^{-1/2} \end{aligned}$$

for some \(c'>0\) and all sufficiently large T. It does not seem to be possible to derive the Wasserstein part of this bound from [25, Theorem 3.1]; see also [6, Remark 3.8]. The reason is that the third term on the right-hand side of [25, (3.9)] does not have the appropriate order.

Example 6.5

Let \(h:\mathbb {R}^d\rightarrow \mathbb {R}_+\) be a measurable function satisfying \(\int (h(s)+h(s)^2)\,ds<\infty \). Define \(Z{:}{=}\{(s,x)\in \mathbb {R}^d\times \mathbb {R}_+: x\le h(s)\}\) and \(Z_t{:}{=}Z+t\), \(t\in \mathbb {R}^d\). We interpret Z and \(Z_t\) as constant mappings on \(\textbf{N}\) and check that (6.4)–(6.6) are satisfied. For (6.4), we note that

$$\begin{aligned}&\int \lambda (Z_0\cap Z_s)\,ds =\int {{\textbf{1}}}\{y\le h(t),y\le h(t-s)\}\,d(t,y,s)\\&\quad =\int {{\textbf{1}}}\{y\le h(t),y\le h(s)\}\,d(t,y,s) =\int \bigg (\int {{\textbf{1}}}\{y\le h(s)\}\,ds\bigg )^2\,dy. \end{aligned}$$

Since h is square integrable, we have \(\int {{\textbf{1}}}\{y\le h(s)\}\,ds\le c y^{-2}\) for some \(c>0\), so that the above integral is finite. Relation (6.5) follows at once from the integrability of h, while the left-hand side of (6.6) is bounded by \(\int h(s)^2\,ds\).

Assume now that the function \(\varphi \) satisfies

$$\begin{aligned} \varphi (t,\mu )=\varphi (t,\mu _{Z_t}),\quad (t,\mu )\in \mathbb {R}^d\times \textbf{N}. \end{aligned}$$

Then (6.2) holds. Assumptions (6.3) and (6.7) depend on the choice of \(\varphi \). They are satisfied, for instance, if \(\varphi (t,\cdot )\) is a polynomial or exponential function of \(\mu (Z_t)\).
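As a minimal worked check under the stated assumptions, take, say, \(\varphi (t,\mu ){:}{=}\mu (Z_t)^2\). For \(\mu '\in \textbf{N}\) with \(\mu '(\mathbb {X})\le 3\),

$$\begin{aligned} \varphi (t,\mu +\mu ')=\big ((\mu +\mu ')(Z_t)\big )^2\le \big (\mu (Z_t)+3\big )^2 =Y(\theta _t\mu ) \quad \text {with} \quad Y(\nu ){:}{=}\big (\nu (Z_0)+3\big )^2, \end{aligned}$$

so that (6.3) holds, and (6.7) follows since \(\eta (Z_0)\) is Poisson distributed with finite mean \(\lambda (Z)=\int h(s)\,ds\), whence all moments of \(Y_0\) are finite.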

Assume that u and \(\mathbb {E}\varphi (\cdot ,\eta _{Z})\) are bounded below by some \(c>0\) and that, for all \(s\in \mathbb {R}^d\), \(\varphi (s,\cdot )\) is either increasing or decreasing under the addition of a point. Then Theorem 6.3 yields a (quantitative) central limit theorem as \(\lambda _d(B)\rightarrow \infty \). To this end, we need to find a lower bound for \(\sigma ^2_B\), given by (2.6). In our case, the first term on the right-hand side of (2.6) equals

$$\begin{aligned} \mathbb {E}\int {{\textbf{1}}}\{s\in B\}u(s)^2{{\textbf{1}}}\{x\le \varphi (s,\eta _{Z_s})\}\,d(s,x) \end{aligned}$$

and has the lower bound

$$\begin{aligned} c^2\,\int {{\textbf{1}}}\{s\in B\}\mathbb {E}\varphi (s,\eta _{Z_s})\,ds\ge c^3\lambda _d(B). \end{aligned}$$

The second term is given by

$$\begin{aligned} \mathbb {E}\int {{\textbf{1}}}\{s,t\in B\}u(s)u(t) D_{t,y}{{\textbf{1}}}\{x\le \varphi (s,\eta )\}D_{s,x}{{\textbf{1}}}\{y\le \varphi (t,\eta )\}\,d(s,x,t,y). \end{aligned}$$

By the monotonicity assumption on \(\varphi \) and \(u\ge c\), this is non-negative.

Example 6.6

For a point configuration \(\mu \in {\textbf{N}}\) and \(w\in \mathbb {X}\), the Voronoi cell of w is given by

$$\begin{aligned} V(w,\mu ) {:}{=} \{v\in \mathbb {X}: \Vert w-v\Vert \le \Vert w'-v\Vert \text { for all } w'\in \mu \}, \end{aligned}$$

i.e. \(V(w,\mu )\) is the set of all points in \(\mathbb {X}\) such that no point of \(\mu \) is closer than w. The cells \((V(w,\mu ))_{w\in \mu }\) have disjoint interiors and form a tessellation of \(\mathbb {X}\), the so-called Voronoi tessellation, which is an often studied model from stochastic geometry (see, for example, [21, Section 10.2]). From the Poisson–Voronoi tessellation (i.e. the Voronoi tessellation with respect to \(\eta \)), we construct the point process

$$\begin{aligned} \xi {:}{=} \int {\textbf{1}}\{s\in \cdot \} {\textbf{1}}\{ V((s,x),\eta )\cap (\mathbb {R}^d\times \{0\})\ne \varnothing \} \, \eta (d(s,x)). \end{aligned}$$
(6.11)

This point process has the following geometric interpretation. We take all cells of the Poisson–Voronoi tessellation that intersect \(\mathbb {R}^d\times \{0\}\), which one can think of as the lowest layer of the tessellation; the first coordinates of their nuclei are the points of \(\xi \). The points of \(\xi \) form the projection of a one-sided version of the Markov path considered in [1].

First we check that \(\xi \) can be represented as in (6.1). For \(s\in \mathbb {R}^d\), \(x_1,x_2\in \mathbb {R}_+\) with \(x_1<x_2\) and \(\mu \in {\textbf{N}}\), we have

$$\begin{aligned} V((s,x_1),\mu ) \cap (\mathbb {R}^d\times \{0\}) \supset V((s,x_2),\mu ) \cap (\mathbb {R}^d\times \{0\}). \end{aligned}$$
(6.12)

If \(V((s,0),\mu )\) is bounded, which is the case for \(\mathbb {P}_\eta \)-a.e. \(\mu \), there exists a unique \(x_0\in \mathbb {R}_+\) such that \(V((s,x_0),\mu ) \cap (\mathbb {R}^d\times \{0\})\) consists of exactly one point. This allows us to rewrite \(\xi \) as

$$\begin{aligned} \xi = \int {{\textbf{1}}}\{s\in \cdot \} {{\textbf{1}}}\{ x \le \varphi (s,\eta -\delta _{(s,x)}) \} \, \eta (d(s,x)) \end{aligned}$$

with

$$\begin{aligned} \varphi (s,\mu ){:}{=}\sup \{x\in \mathbb {R}_+: V((s,x),\mu ) \cap (\mathbb {R}^d\times \{0\})\ne \varnothing \}. \end{aligned}$$
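Numerically, \(\varphi (s,\mu )\) can be approximated by exploiting the monotonicity (6.12): the heights x whose cell meets \(\mathbb {R}^d\times \{0\}\) form an interval starting at 0, so bisection applies. The following sketch (for d = 1, with a finite grid of candidate line points; all names and tolerances are illustrative assumptions) is an approximation, not the exact functional.

```python
import numpy as np

def cell_meets_line(s, x, mu, r_grid):
    """True if the Voronoi cell of (s, x) w.r.t. mu + delta_{(s,x)} contains
    some grid point (r, 0), i.e. (s, x) is at least as close to (r, 0)
    as every point of mu."""
    d_sx = np.hypot(r_grid - s, x)
    d_mu = np.min(np.hypot(mu[:, 0][:, None] - r_grid,
                           mu[:, 1][:, None]), axis=0)
    return bool(np.any(d_sx <= d_mu))

def phi_approx(s, mu, r_grid, hi=10.0, tol=1e-3):
    """Bisection for phi(s, mu) = sup{x : cell of (s, x) meets R x {0}};
    hi must exceed the true value."""
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cell_meets_line(s, mid, mu, r_grid):
            lo = mid
        else:
            hi = mid
    return lo
```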

For \(s\in \mathbb {R}^d\) and \(\mu \in {\textbf{N}}\), let

$$\begin{aligned} R(s,\mu ){:}{=}\sup \{\Vert (s,0)-v\Vert : v\in V((s,0),\mu )\}, \end{aligned}$$

which is the maximal distance from (s, 0) to a point of its Voronoi cell. Note that \(V((s,0),\mu )\) is completely determined by the points of \(\mu \) in \(B((s,0),2R(s,\mu ))\), the closed ball in \(\mathbb {X}\) with radius \(2R(s,\mu )\) around (s, 0). Indeed, the centres of all cells neighbouring the Voronoi cell of (s, 0) lie within this ball, and all points of \(\mu \) outside are too far away to affect the cell. Considered as a function of x, the sets \(V((s,x),\mu )\cap (\mathbb {R}^d\times \{0\})\) are non-increasing in x (see (6.12)), and \((V((s,0),\mu )\cap (\mathbb {R}^d\times \{0\})) {\setminus } (V((s,x),\mu )\cap (\mathbb {R}^d\times \{0\}))\) is divided among the neighbouring cells of \(V((s,0),\mu )\). This implies that \(V((s,x),\mu )\cap (\mathbb {R}^d\times \{0\})\) is also completely determined by the points of \(\mu \) in \(B((s,0),2R(s,\mu ))\). Hence, we can conclude that

$$\begin{aligned} \varphi (s,\mu ) = \varphi (s,\mu _{B((s,0),2R(s,\mu ))}). \end{aligned}$$

Since this identity is still valid if we restrict \(\mu \) to a larger set on the right-hand side and R is non-increasing with respect to the point configuration, we obtain

$$\begin{aligned} \varphi (s,\mu +\mu ') = \varphi (s, (\mu +\mu ')_{B((s,0),2R(s,\mu +\mu '))} ) = \varphi (s, (\mu +\mu ')_{B((s,0),2R(s,\mu ))}) \end{aligned}$$

for all \(\mu '\in {\textbf{N}}\) with \(\mu '(\mathbb {X})\le 3\), which is (6.2) with \(Z_s=B((s,0),2R(s,\mu ))\). Since for each point of \(V((s,0),\mu )\cap (\mathbb {R}^d\times \{0\})\), there exists a point of \(\mu \) different from (s, 0) which is at most \(2R(s,\mu )\) away, we obtain

$$\begin{aligned} \varphi (s,\mu +\mu ') \le \varphi (s,\mu ) \le 2R(s,\mu ), \end{aligned}$$

which is (6.3).

Note that for any \(s\in \mathbb {R}^d\) one can partition \(\mathbb {X}\) into finitely many cones \({\mathcal {C}}_1,\ldots ,{\mathcal {C}}_m\) with apex (s, 0) such that

$$\begin{aligned} \max _{i\in \{1,\ldots ,m\}} \inf _{y\in \mu \cap {\mathcal {C}}_i} \Vert y - (s,0)\Vert \ge R(s,\mu ) \end{aligned}$$

for all \(\mu \in {\textbf{N}}\) (see, for example, [17, Subsection 6.3]). Hence, there exist constants \(C,c>0\) such that

$$\begin{aligned} \mathbb {P}(R(s,\eta )\ge u) \le C \exp (-c u^{d+1}) \end{aligned}$$

for all \(u\ge 0\) and \(s\in \mathbb {R}^d\). Indeed, if \(R(s,\eta )\ge u\), then some cone \({\mathcal {C}}_i\) contains no point of \(\eta \) within distance u of (s, 0), an event whose probability is \(\exp (-\lambda ({\mathcal {C}}_i\cap B((s,0),u)))\) with \(\lambda ({\mathcal {C}}_i\cap B((s,0),u))\) proportional to \(u^{d+1}\). Using this exponential decay, it is easy to verify (6.4)–(6.7). Relations (6.5) and (6.7) are obvious. To see (6.4), we can use the bound

$$\begin{aligned} \lambda (B((0,0),2R(0,\eta ))\cap B((s,0),2R(s,\eta )))^4&\le {{\textbf{1}}}\{2R(0,\eta )>\Vert s\Vert /2\}\lambda (B((0,0),2R(0,\eta )))^4\\&\quad +{{\textbf{1}}}\{2R(s,\eta )>\Vert s\Vert /2\}\lambda (B((s,0),2R(s,\eta )))^4. \end{aligned}$$

For (6.6), we can bound \(\mathbb {P}((s,x)\in B((0,0),2R(0,\eta )), (0,y)\in B((s,0),2R(s,\eta )))\) by the Cauchy–Schwarz inequality and then bound the resulting integral. This yields that the conclusions of Theorem 6.3 hold for the point process \(\xi \) from (6.11).

Since \(\varphi \) is non-increasing with respect to additional points, one can argue as in the previous example to see that there is a lower bound for the variance of order \(\lambda _d(B)\) if \(u>c_0\) for some \(c_0>0\). This yields a (quantitative) central limit theorem as \(\lambda _d(B)\rightarrow \infty \).

7 Functionals Generated by a Partial Order

In this section, we return to the setting of a general \(\sigma \)-finite measure space \((\mathbb {X},\mathcal {X},\lambda )\). In many situations, the functional \(G_x\) can be written as \(G_x(\mu )=f(x)H_x(\mu )\), where \(f\in L^2(\lambda )\) and the functional \(H_x(\mu )\) is measurable in both arguments, takes values in \(\{0,1\}\) and can be decomposed as

$$\begin{aligned} H_x(\mu )=\prod _{y\in \mu } H_x(\delta _y). \end{aligned}$$
(7.1)

Write \(H_x(y)\) as shorthand for \(H_x(\delta _y)\), and denote \(\overline{H}_x(y){:}{=}1-H_x(y)\). A generic way to construct such functionals is to consider a strict partial order \(\prec \) on \(\mathbb {X}\) and to set \(H_x(y){:}{=}1-{{\textbf{1}}}\{y\prec x\}\). The set of points \(x\in \eta \) such that \(H_x(\eta )=1\) is called the set of Pareto optimal points with respect to the chosen partial order, i.e. \(x\in \eta \) is Pareto optimal if there exists no \(y\in \eta \) such that \(y\prec x\). For \(x\notin \eta \), we have \(H_x(\eta )=1\) if x is Pareto optimal in \(\eta +\delta _x\). If \(\varvec{\delta }(G)\) can be defined pathwise as in (1.1), then it equals the sum of the values of f over Pareto optimal points, centred by the integral of f over the set of x such that \(H_x(\eta )=1\). As shown in [12], such examples naturally arise in statistical applications.

It is easy to see by induction that

$$\begin{aligned} D^m_{z_1,\dots ,z_m}G_x(\mu )=(-1)^m f(x) H_x(\mu ) \prod _{i=1}^m \overline{H}_x(z_i). \end{aligned}$$
(7.2)

In particular,

$$\begin{aligned} D_zG_x(\mu )=-f(x)H_x(\mu )\overline{H}_x(z). \end{aligned}$$
(7.3)

By construction, \(H_y(\eta )=1\) and \(\overline{H}_y(x)=1\) yield that \(H_x(\eta )=1\), which can be expressed as

$$\begin{aligned} H_x(\eta )H_y(\eta )\overline{H}_y(x)=H_y(\eta )\overline{H}_y(x), \end{aligned}$$
(7.4)

so that

$$\begin{aligned} G_xD_xG_y =f(x) D_xG_y. \end{aligned}$$
(7.5)

The asymmetry property of the strict partial order implies that \(\overline{H}_x(y)\overline{H}_y(x)=0\) for all \(x,y\in \mathbb {X}\). Hence, the functional G satisfies the cyclic condition (2.12). Thus, the second term on the right-hand side of (2.6) vanishes. If (2.2) and (2.3) are satisfied, it follows from [13, Proposition 2.3] that the KS-integral \(\varvec{\delta }(G)\) of G is well defined and

$$\begin{aligned} \mathbb {E}\varvec{\delta }(G)^2=\mathbb {E}\int f(x)^2 H_x(\eta )\,\lambda (dx). \end{aligned}$$
(7.6)

In addition, property (7.1) leads to a considerable simplification of the terms arising in the bounds in Corollary 2.2. Write \(H_x\) as a shorthand for \(H_x(\eta )\), and denote

$$\begin{aligned} h_i(y){:}{=}\int f(x)^i\overline{H}_y(x)\,\lambda (dx), \quad i=0,1,2, \end{aligned}$$

and

$$\begin{aligned} {\tilde{h}}(y){:}{=}\int |f(x)|\overline{H}_y(x)\,\lambda (dx). \end{aligned}$$

Proposition 7.1

Assume that \(G_x(\mu )=f(x)H_x(\mu )\), where \(f\in L^2(\lambda )\) and the functional H is determined by (7.1) from a strict partial order on \(\mathbb {X}\). Then the terms \(T_2\) and \(T_8\) defined before Theorem 2.1 vanish and the other terms satisfy

$$\begin{aligned} T_1&=\bigg (\int f(x)^2f(z)^2 \mathbb {E}H_xH_z \overline{H}_x(y)\overline{H}_z(y) \,\lambda ^3(d(x,y,z))\bigg )^{1/2},\\ T_3&=\int |f(x)|^3 \mathbb {E}H_x\,\lambda (dx),\\ T_4&\le \int \big (2 h_2(y)|f(y)|+3 {\tilde{h}}(y)f(y)^2\big ) \mathbb {E}H_y\,\lambda (dy),\\ T_5&\le 8 \int {\tilde{h}}(z)^2|f(z)|\mathbb {E}H_z\,\lambda (dz),\\ T_6&=\bigg (\int \big (f(y)h_1(y)\big )^2\big (1+h_0(y)\big ) \mathbb {E}H_y\,\lambda (dy)\bigg )^{1/2},\\ T_7&= \Big (\int |f(x)|^4 \mathbb {E}H_x \,\lambda (dx)\Big )^{1/2},\\ T_9&= \bigg (\int f(y)^2\Big [3+3h_0(y)+2h_0(y)^2\Big ]h_2(y) \mathbb {E}H_y \,\lambda (dy)\bigg )^{1/2}. \end{aligned}$$

Suppose \({{\,\mathrm{{\mathbb Var}}\,}}\varvec{\delta }(G)>0\) and that (2.2)–(2.5) are satisfied. Then

$$\begin{aligned} d_W\bigg (\frac{\varvec{\delta }(G)}{\sqrt{{{\,\mathrm{{\mathbb Var}}\,}}\varvec{\delta }(G)}},N\bigg )\le \frac{T_1}{{{\,\mathrm{{\mathbb Var}}\,}}\varvec{\delta }(G)} + \frac{T_3+T_4+T_5}{\sqrt{{{\,\mathrm{{\mathbb Var}}\,}}\varvec{\delta }(G)}^3}. \end{aligned}$$

If, additionally, (2.7)–(2.9) are satisfied, then

$$\begin{aligned} d_K\bigg (\frac{\varvec{\delta }(G)}{\sqrt{{{\,\mathrm{{\mathbb Var}}\,}}\varvec{\delta }(G)}},N\bigg )\le \frac{T_1+T_6+2(T_7+T_9)}{{{\,\mathrm{{\mathbb Var}}\,}}\varvec{\delta }(G)}. \end{aligned}$$

Proof

The expression for \(T_1\) follows from \(G_x^2=f(x)G_x\) for \(x\in \mathbb {X}\) and (7.3), while \(T_3\) results from the definition of \(G_x\). Now consider the remaining terms appearing in Corollary 2.2. We rely on (7.2) with \(m=2,3\), (7.3) and (7.5) in the subsequent calculations. First,

$$\begin{aligned} T_4&=\mathbb {E}\int \Big (2f(x)^2 |f(y)| H_y\overline{H}_y(x)\\&\qquad \qquad \qquad +|f(x)|f(y)^2 H_y\overline{H}_y(x)\big (2H_y+H_y\overline{H}_y(x)\big ) \Big )\,\lambda ^2(d(x,y))\\&\le \int \big (2f(x)^2 |f(y)|+3 |f(x)|f(y)^2\big ) \mathbb {E}H_y\overline{H}_y(x) \,\lambda ^2(d(x,y)), \end{aligned}$$

which yields the expression for \(T_4\) in view of the definitions of the functions \(h_2\) and \({\tilde{h}}\). Next,

$$\begin{aligned} T_5&=\mathbb {E}\int 2|f(x)f(y)f(z)| \big (H_z\overline{H}_z(y)+H_z\overline{H}_z(y)\overline{H}_z(x)\big )\\&\quad \times \big (H_y\overline{H}_y(x)\overline{H}_y(z)+ 2 H_y\overline{H}_y(x)\big ) \,\lambda ^3(d(x,y,z)) \\&= \mathbb {E}\int 2|f(x)f(y)f(z)| H_z\overline{H}_z(y)\big (1+\overline{H}_z(x)\big ) H_y\overline{H}_y(x)\big (\overline{H}_y(z)+2\big ) \,\lambda ^3(d(x,y,z)) \\&\le 8 \mathbb {E}\int |f(x)f(y)f(z)| H_z\overline{H}_z(y)H_y\overline{H}_y(x) \,\lambda ^3(d(x,y,z)) \\&\qquad \qquad = 8 \int |f(x)f(y)f(z)|\mathbb {E}H_z\overline{H}_z(y)\overline{H}_y(x) \,\lambda ^3(d(x,y,z)), \end{aligned}$$

where we used the fact that \(\overline{H}_z(y) \overline{H}_y(z)=0\) for all y and z as well as (7.4). This yields the sought bound for \(T_5\), taking into account that \(\overline{H}_z(y)\overline{H}_y(x)\le \overline{H}_z(y)\overline{H}_z(x)\). Next, \(T_6=(T_{6,1}+T_{6,2})^{1/2}\), where

$$\begin{aligned} T_{6,1}&{:}{=}\mathbb {E}\int \Big (\int f(x)f(y) H_y\overline{H}_y(x) \,\lambda (dx)\Big )^2\,\lambda (dy)\\&=\int f(y)^2 \mathbb {E}H_y \Big (\int f(x)\overline{H}_y(x)\,\lambda (dx)\Big )^2\,\lambda (dy) =\int f(y)^2 h_1(y)^2 \mathbb {E}H_y\,\lambda (dy) \end{aligned}$$

and

$$\begin{aligned} T_{6,2}&{:}{=}\mathbb {E}\int \Big (\int f(x)f(y)H_y\overline{H}_y(x)\overline{H}_y(z)\,\lambda (dx)\Big )^2\,\lambda ^2(d(y,z))\\&=\int f(y)^2 \mathbb {E}H_y \overline{H}_y(z) h_1(y)^2\,\lambda ^2(d(y,z)). \end{aligned}$$

Hence, the expression for \(T_6\) follows. The expression for \(T_7\) follows directly from the definition of \(G_x\). Finally, \(T_9=(3T_{9,1}+3T_{9,2}+2T_{9,3})^{1/2}\), where

$$\begin{aligned} T_{9,1}&{:}{=}\int f(x)^2 f(y)^2 \mathbb {E}H_y\overline{H}_y(x)\,\lambda ^2(d(x,y)),\\ T_{9,2}&{:}{=}\int f(x)^2f(y)^2 \mathbb {E}H_y\overline{H}_y(x)\overline{H}_y(z)\,\lambda ^3(d(x,y,z)),\\ T_{9,3}&{:}{=}\int f(x)^2f(y)^2\mathbb {E}H_y\overline{H}_y(x)\overline{H}_y(z)\overline{H}_y(w) \,\lambda ^4(d(x,y,z,w)). \end{aligned}$$

Thus,

$$\begin{aligned} T_9&= \bigg (\int f(x)^2f(y)^2\Big [3+3h_0(y)+2h_0(y)^2\Big ] \mathbb {E}H_y \overline{H}_y(x)\,\lambda ^2(d(x,y))\bigg )^{1/2}, \end{aligned}$$

which yields the formula for \(T_9\). The bounds for the normal approximation follow from Corollary 2.2 and the normalisation by \(\sqrt{{{\,\mathrm{{\mathbb Var}}\,}}\varvec{\delta }(G)}\). \(\square \)

Example 7.2

Let \(\mathbb {X}\) be the unit cube \([0,1]^d\) with the Lebesgue measure \(\lambda \). For \(x,y\in \mathbb {X}\), write \(y\prec x\) if \(x\ne y\) and all components of y are not greater than the corresponding components of x. Let \(G_x(\mu )=H_x(\mu )\), with \(H_x(\mu )\) given by (7.1) and \(\overline{H}_x(y){:}{=}{{\textbf{1}}}\{y\prec x\}\).

Let \(\eta _t\) be the Poisson process on \(\mathbb {X}\) of intensity \(t\lambda \). Then \(G_x(\eta _t)=1\) means that none of the points \(y\in \eta _t\) satisfies \(y\prec x\), that is, none of the points from \(\eta _t\) is smaller than x in the coordinatewise order. In this case, x is said to be a Pareto optimal point in \(\eta _t+\delta _x\). Then \(\varvec{\delta }(G)\) equals the difference between the number of Pareto optimal points in \(\eta _t\) and t times the volume of the complement of the set of points \(x\in \mathbb {X}\) such that \(y\prec x\) for at least one \(y\in \eta _t\).
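As a quick numerical illustration, the following Monte Carlo sketch simulates \(\varvec{\delta }(G)\) for \(d=2\); the intensity, sample sizes and seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
d, t = 2, 200.0                               # dimension and intensity t

# Poisson process eta_t on [0, 1]^2 with intensity t * Lebesgue.
pts = rng.uniform(size=(rng.poisson(t), d))

def pareto_optimal(i):
    """True if no other sample point lies coordinatewise below pts[i]
    (exact ties have probability zero)."""
    others = np.delete(pts, i, axis=0)
    return not np.any(np.all(others <= pts[i], axis=1))

n_pareto = sum(pareto_optimal(i) for i in range(len(pts)))

# Second term of delta(G): t times the volume of {x : H_x(eta_t) = 1},
# estimated by Monte Carlo over uniform points in [0, 1]^2.
mc = rng.uniform(size=(20_000, d))
dominated = np.any(np.all(pts[None, :, :] <= mc[:, None, :], axis=2), axis=1)
ks_integral = n_pareto - t * (1.0 - dominated.mean())

print("Pareto optimal points:", n_pareto, " delta(G) ~", ks_integral)
```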

For \(x=(x_1,\dots ,x_d)\in \mathbb {X}\), denote \(|x|{:}{=}x_1\cdots x_d\). Then \(\mathbb {E}H_x(\eta _t)=e^{-t|x|}\), and (7.6) yields that the variance of \(\varvec{\delta }(G)\) is

$$\begin{aligned} \sigma ^2_t{:}{=}t\int e^{-t|z|}\, \lambda (dz). \end{aligned}$$

It is shown in [2] that the right-hand side is of order \(\log ^{d-1} t\) for large t. Note that the above formula gives also the expected number of Pareto optimal points.
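To see where this order comes from, here is a sketch of the standard computation: since \(-\log |z|\) is a sum of d independent standard exponential random variables when z is uniform on \([0,1]^d\), the product |z| has density \((\log (1/u))^{d-1}/(d-1)!\) on (0, 1), so that the substitution \(v=tu\) gives

$$\begin{aligned} \sigma ^2_t=t\int _0^1 e^{-tu}\,\frac{(\log (1/u))^{d-1}}{(d-1)!}\,du =\int _0^{t} e^{-v}\,\frac{(\log (t/v))^{d-1}}{(d-1)!}\,dv \sim \frac{(\log t)^{d-1}}{(d-1)!} \end{aligned}$$

as \(t\rightarrow \infty \).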

Quantitative limit theorems for the number of Pareto optimal points, centred by subtracting the mean and scaled by the standard deviation, were obtained in [3]. Below we derive a variant of such a result for the KS-integral, which involves a different stochastic centring.

Since \(G_x(\eta )=f(x)H_x(\eta )\) with the function f identically equal to one and the measure \(\lambda \) is finite, the integrability conditions (2.2)–(2.5) and (2.7)–(2.9) are satisfied. The terms arising in Proposition 7.1 can be calculated as follows. First,

$$\begin{aligned} T_1^2&=t^{3}\int \mathbb {E}\big [H_x(\eta _t) H_y(\eta _t)\big ]|x\wedge y|\,\lambda ^2(d(x,y))\\&= t^{3} \int e^{-t(|x|+|y|-|x\wedge y|)} |x\wedge y|\, \lambda ^2(d(x,y)), \end{aligned}$$

where \(x\wedge y\) denotes the coordinatewise minimum of \(x,y\in [0,1]^d\). Fix a (possibly empty) set \(I\subseteq \{1,\dots ,d\}\), let \(J{:}{=}I^c\), and denote by \(x^I\) and \(x^J\) the subvectors of \(x\in [0,1]^d\) formed by coordinates from I and J. It suffices to restrict the integration domain to the set where \(x\wedge y=(x^I,y^J)\) and let \(T^2_{1,I}\) be the corresponding integral. Let m denote the cardinality of I. If \(m=0\), then

$$\begin{aligned} T_{1,I}^2 = t^{3} \int e^{-t|x|} |y|{{\textbf{1}}}\{y\prec x\}\,\lambda ^2(d(x,y)) = 2^{-d}t^{3} \int e^{-t|x|} |x|^2\,\lambda (dx) \le 27\cdot 2^{-d}\sigma _{t/3}^2. \end{aligned}$$

Here and in what follows, we use the inequality \(se^{-s}\le 1\) with \(s=t|y|/i\), which yields that

$$\begin{aligned} t^i\int |y|^{i-1} e^{-t|y|} \,\lambda (dy) \le t\int (t|y|e^{-t|y|/i})^{i-1} e^{-t|y|/i}\,\lambda (dy) \le i^{i}\sigma _{t/i}^2,\quad i\in \mathbb {N}. \end{aligned}$$

The same calculation applies if \(m=d\). If \(m\in \{1,\dots ,d-1\}\), then

$$\begin{aligned} T_{1,I}^2&= t^{3} \int _{[0,1]^d}e^{t|x^I|\,|y^J|}|x^I|\,|y^J| \left( \int _{[0,1]^{m}}e^{-t|y^I|\,|y^J|}{{\textbf{1}}}\{x^I\prec y^I\}\, dy^I\right) \\&\quad \times \left( \int _{[0,1]^{d-m}} e^{-t|x^I|\,|x^J|}{{\textbf{1}}}\{y^J\prec x^J\}\,dx^J\right) \,\lambda (d(x^I,y^J)). \end{aligned}$$

It can be shown by a small adaptation of the proof of [3, Lemma 3.1] that

$$\begin{aligned} s\int _{[0,1]^m} e^{-s|x|}{{\textbf{1}}}\{y\prec x\} \, dx \le Ce^{-s|y|/a}\Big [1+\big |\log (s|y|)\big |^{m-1}\Big ], \quad y\in [0,1]^m, \end{aligned}$$

for any \(a>1\) and a constant C that depends on m and a. Let \(a\in (1,2)\). Then, with \(s{:}{=}t|y^J|\), we have

$$\begin{aligned} t\int _{[0,1]^m} e^{-t|y^I|\,|y^J|} {{\textbf{1}}}\{x^I\prec y^I\} |y^J| dy^I \le Ce^{-t|y^J||x^I|/a}\Big [1+\big |\log (t|y^J||x^I|)\big |^{m-1}\Big ]. \end{aligned}$$

By applying the same argument to the integral over \([0,1]^{d-m}\), we have that

$$\begin{aligned} T_{1,I}^2 \le C^2 t \int e^{-t|z|(2/a-1)}\Big [1+\big |\log (t|z|)\big |^{m-1}\Big ] \Big [1+\big |\log (t|z|)\big |^{d-m-1}\Big ]\,\lambda (dz). \end{aligned}$$

This is \({\mathcal {O}}(\log ^{d-1} t)\), as one sees by considering all summands separately and following the proof of [3, Lemma 3.2].

In this setting, \(h_i(y)=t|y|\) for all i and \({\tilde{h}}(y)=t|y|\). Further terms can be calculated as follows:

$$\begin{aligned} T_3&=t\int e^{-t|x|}\,\lambda (dx)=\sigma _t^2,\\ T_4&\le 5t^2\int |y|e^{-t|y|}\,\lambda (dy)\le 20\sigma _{t/2}^2,\\ T_5&\le 8 t^3\int |y|^2e^{-t|y|}\,\lambda (dy)\le 216 \sigma _{t/3}^2, \end{aligned}$$

and the terms involved in the bound on the Kolmogorov distance are

$$\begin{aligned} T_6&= \Big (\int t^3|y|^2(1+t|y|) e^{-t|y|} \,\lambda (dy)\Big )^{1/2} \le (27\sigma _{t/3}^2+256\sigma _{t/4}^2)^{1/2},\\ T_7&= \Big (t\int e^{-t|x|}\,\lambda (dx)\Big )^{1/2}=\sigma _t,\\ T_9&= \Big (\int \big [3t^2+3|y|t^3+2|y|^2t^4\big ]|y| e^{-t|y|}\,\lambda (dy)\Big )^{1/2} \le (12\sigma _{t/2}^2+81\sigma _{t/3}^2+512\sigma _{t/4}^2)^{1/2}. \end{aligned}$$

Noticing that \(\sigma _t^2={{\,\mathrm{{\mathbb Var}}\,}}\varvec{\delta }(G)\) behaves like \(\log ^{d-1} t\), we obtain from Proposition 7.1 that

$$\begin{aligned} \max \Big (d_W(\sigma _t^{-1}\varvec{\delta }(G),N), d_K(\sigma _t^{-1}\varvec{\delta }(G),N)\Big ) ={\mathcal {O}}(\sigma _t^{-1}). \end{aligned}$$