Normal approximation of Kabanov–Skorohod integrals on Poisson spaces

We consider the normal approximation of Kabanov–Skorohod integrals on a general Poisson space. Our bounds are for the Wasserstein and the Kolmogorov distance and involve only difference operators of the integrand of the Kabanov–Skorohod integral. The proofs rely on the Malliavin–Stein method and, in particular, on multiple applications of integration by parts formulae. As examples, we study some linear statistics of point processes that can be constructed by Poisson embeddings and functionals related to Pareto optimal points of a Poisson process.


Introduction
Let η be a Poisson process on a measurable space (X, 𝒳) with a σ-finite intensity measure λ, defined on some probability space (Ω, ℱ, P). Formally, η is a point process, that is, a random element of the space N of all σ-finite measures on X with values in ℕ₀ ∪ {∞}, equipped with the smallest σ-field 𝒩 making the mappings µ ↦ µ(B) measurable for each B ∈ 𝒳. The Poisson process η is completely independent, that is, η(B_1), …, η(B_n) are independent for pairwise disjoint B_1, …, B_n ∈ 𝒳, n ∈ ℕ, and η(B) has for each B ∈ 𝒳 a Poisson distribution with parameter λ(B); see e.g. [7, 14].
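For intuition, the defining properties can be checked by simulation. The following minimal sketch (ours, not part of the paper; all names and parameter values are our choices) samples a Poisson process on [0, 1]^d with intensity measure t·λ_d and verifies that the count in a fixed set has the predicted Poisson mean and variance:

```python
import numpy as np

rng = np.random.default_rng(1)

def poisson_process_unit_cube(t, d, rng):
    """Sample a Poisson process on [0,1]^d with intensity measure t * Lebesgue:
    a Poisson(t) total number of points, placed i.i.d. uniformly."""
    return rng.uniform(size=(rng.poisson(t), d))

# eta(B) is Poisson with parameter t * lambda_d(B); counts of disjoint sets
# are independent (complete independence).
t, d, reps = 50.0, 2, 5000
counts = np.array([np.sum(poisson_process_unit_cube(t, d, rng)[:, 0] < 0.3)
                   for _ in range(reps)])
print(counts.mean(), counts.var())  # both close to t * 0.3 = 15
```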
Let G : N × X → R be a measurable function which is square integrable with respect to P_η ⊗ λ, where P_η := P(η ∈ ·) denotes the distribution of η. In this paper, we study the Kabanov–Skorohod integral (short: KS-integral) of G, defined as a Malliavin operator. If G is in the domain of the KS-integral and integrable with respect to P_η ⊗ λ, its KS-integral is pathwise given by

δ(G) = ∫ G_x(η − δ_x) η(dx) − ∫ G_x(η) λ(dx),   (1.1)

where δ_x stands for the Dirac measure at x ∈ X; see e.g. [10, Theorem 6]. In this case, the Mecke formula immediately yields that Eδ(G) = 0. We refer to [10] for an introduction to stochastic calculus on a general Poisson space. The pathwise representation (1.1) of the KS-integral consists of two terms. The first term is the sum of the values G_x(η − δ_x) over the points of η. Such sums have been intensively studied. The state of the art of limit theorems for such sums is presented in [9], based on the idea of stabilisation. The stabilisation property means that the functional G_x(η − δ_x) depends only on points of η within some finite random distance from x, with conditions imposed on the distribution of such a distance. As in [9], we use recent developments of the Malliavin–Stein technique for Poisson processes, first elaborated in [15] and then extended in [5, 8, 13, 22].
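The pathwise formula (1.1) and the centring Eδ(G) = 0 can be illustrated numerically. In the sketch below (ours; the integrand G_x(µ) := 1{µ(B(x, r)) = 0}, the intensity t and the radius r are illustrative choices, not from the paper), the λ-integral in (1.1) is approximated by Monte Carlo, and averaging δ(G) over realisations recovers the zero mean guaranteed by the Mecke formula:

```python
import numpy as np

rng = np.random.default_rng(2)
t, r = 100.0, 0.05   # lambda = t * Lebesgue on [0,1]^2

def G(x, pts):
    """Toy integrand: G_x(mu) = 1 if mu has no point within distance r of x."""
    return 1.0 if len(pts) == 0 else float(np.min(np.linalg.norm(pts - x, axis=1)) > r)

def ks_integral(pts, n_mc=1000):
    # first term of (1.1): sum of G_x(eta - delta_x) over the points x of eta
    s = sum(G(pts[i], np.delete(pts, i, axis=0)) for i in range(len(pts)))
    # second term of (1.1): integral of G_x(eta) with respect to lambda,
    # approximated by Monte Carlo on the unit square
    xs = rng.uniform(size=(n_mc, 2))
    return s - t * np.mean([G(x, pts) for x in xs])

vals = [ks_integral(rng.uniform(size=(rng.poisson(t), 2))) for _ in range(200)]
print(np.mean(vals))  # close to 0: E delta(G) = 0 by the Mecke formula
```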
In all above mentioned works, the sums over Poisson processes are centred by subtracting the expectation, which by the Mecke formula equals ∫ E G_x(η) λ(dx). In contrast, the centring involved in the pathwise construction of the KS-integral in (1.1) is random. As shown in [12], KS-integrals naturally appear in the construction of unbiased estimators derived from Poisson hull operators.
In this paper we derive bounds for the Wasserstein and the Kolmogorov distance between δ(G) and a standard normal random variable. Limit theorems for compensated stochastic Poisson integrals in the Wasserstein distance have been studied in several papers by N. Privault, assuming that X is the Euclidean space R^d, with separate treatments of the cases d = 1 in [20] and d ≥ 2 in [19]. In [20] the integrand is assumed to be adapted, and in [19] it is assumed to be predictable and to have bounded support. In particular, the stochastic integral coincides in both cases with the KS-integral. Under these assumptions, the tools, based on derivation operators and Edgeworth-type expansions, have resulted in bounds involving integrals of the third power of G and differential operators applied to G. In comparison, our results apply to a general state space, are not restricted to predictable (or adapted) integrands, and do not assume the support of the integrand to be bounded in any sense. Furthermore, our bounds are given in terms of difference operators directly applied to the integrand G, and are derived for both the Wasserstein and the Kolmogorov distance. However, our bounds contain the integral of |G|³, which may be larger than the corresponding term in [19]. Our results are used in [12] to derive quantitative central limit theorems.
Let us compare our proof strategy with the standard approach for the normal approximation of Poisson functionals via the Malliavin–Stein method, which goes back to [15] and is also employed in [5, 8, 13, 22]. To this end, we omit all technical assumptions and definitions (some will be given later). Let F be a Poisson functional (a measurable function of η) and let f be the solution of the associated Stein equation. The identity δD = −L, where D is the difference operator and L is the Ornstein–Uhlenbeck generator with its inverse L⁻¹, and integration by parts lead to

E[F f(F)] = E ∫ D_x f(F) (−D_x L⁻¹ F) λ(dx).

This step comes for the price of the term D_x L⁻¹ F, which is often difficult to evaluate and whose treatment is one of the main achievements of [13]. For the special case F = δ(G) the identity δD = −L is not required. Instead, an immediate integration by parts yields that

E[δ(G) f(δ(G))] = E ∫ G_x D_x f(δ(G)) λ(dx),   (1.2)

avoiding the inverse Ornstein–Uhlenbeck generator. We treat the KS-integrals that arise from the Taylor expansion of D_x f(δ(G)) also by integration by parts, so that our final bounds only involve G and its difference operators but no KS-integrals. This is a difference to [26], where the argument in (1.2) is used but no further integration by parts. Even though our proofs differ from previous works, one may wonder whether existing Malliavin–Stein bounds can be applied to δ(G). As they do not involve the inverse Ornstein–Uhlenbeck generator, the results from [13] seem to be the best ones for off-the-shelf use. They require only moments of the first and the second difference operator of the Poisson functional F, which one might also encounter when evaluating the bounds from [5, 8, 15, 22]. In our case, this means that one has to control moments like E G_x⁴, E δ(D_x G)⁴ and E δ(D²_{x,y} G)⁴ for x, y ∈ X. Since we aim for bounds in terms of G and its difference operators, one has to remove the KS-integrals. This can be achieved by fourfold integration by parts, but it would lead to normal approximation bounds that are more involved than in the current paper and include even iterated integrals with roots of the inner integrals. We expect these results to yield the same rates of convergence as our approach but under stronger integrability assumptions. Instead, our approach is direct and leads to much simpler calculations. In particular, it does not require the computation of expressions involving powers of KS-integrals apart from second moments.
Section 2 presents our main results, which are proved in Sections 4 and 5 separately for the Wasserstein and Kolmogorov distances, after recalling necessary results and constructions from stochastic calculus on Poisson spaces in Section 3. We conclude with two examples in Sections 6 and 7 concerning some linear statistics of point processes constructed via Poisson embeddings and Pareto optimal points.

Main results
To state our results we need to introduce some notation. The Wasserstein distance between the laws of two integrable random variables X and Y is defined by

d_W(X, Y) := sup_{h ∈ Lip(1)} |E h(X) − E h(Y)|,

where Lip(1) denotes the space of all Lipschitz functions h : R → R with a Lipschitz constant at most one. The Kolmogorov distance between the laws of X and Y is given by

d_K(X, Y) := sup_{a ∈ R} |P(X ≤ a) − P(Y ≤ a)|.

Given a function f : N → R and x ∈ X, the function D_x f : N → R is defined by

D_x f(µ) := f(µ + δ_x) − f(µ), µ ∈ N.

Then D_x is known as the difference operator. Iterating its definition yields, for given x, z, w ∈ X, the second difference operator D²_{x,z} := D_x D_z and the third difference operator D³_{x,z,w} := D_x D_z D_w, which can again be applied to functions f as above. For a function G : N × X → R (which maps (µ, y) to G_y(µ)) and x, z, w ∈ X we let D_x, D²_{x,z} and D³_{x,z,w} act on G_y(·), so that it makes sense to talk about D_x G_y(µ), D²_{x,z} G_y(µ) and D³_{x,z,w} G_y(µ). Throughout the paper, we write shortly G_y for G_y(η) and similarly for difference operators.
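In code, the difference operators simply amount to adding points to a configuration. A minimal sketch (ours; the pair-counting functional and the radius r are illustrative choices) for which D_x and D²_{x,z} have explicit values:

```python
import numpy as np

def D(f, x, pts):
    """First difference operator: D_x f(mu) = f(mu + delta_x) - f(mu);
    a configuration mu is encoded as an array of points."""
    return f(np.vstack([pts, [x]])) - f(pts)

def D2(f, x, z, pts):
    """Second difference operator D^2_{x,z} f = D_x (D_z f), by iteration."""
    return D(lambda p: D(f, z, p), x, pts)

r = 0.2
def pairs(pts):
    """Number of pairs of points at distance <= r."""
    if len(pts) < 2:
        return 0
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
    return int(np.sum(dist[np.triu_indices(len(pts), k=1)] <= r))

rng = np.random.default_rng(0)
mu = rng.uniform(size=(10, 2))
x, z = np.array([0.5, 0.5]), np.array([0.55, 0.5])
print(D(pairs, x, mu))      # = number of points of mu within r of x
print(D2(pairs, x, z, mu))  # = 1{|x - z| <= r}, here 1
```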
We shall require the following integrability assumptions:

In order to deal with the Kolmogorov distance, we also need to assume that

(2.9)

The following main result on the normal approximation of δ(G) involves only the integrand G and its first, second and third order difference operators. Throughout the paper we let N denote a standard normal random variable. Define

and denote

Theorem 2.1. Suppose that G satisfies (2.2)–(2.5). Assume also that Eδ(G)² = 1. Then (2.10) holds. If, additionally, (2.7), (2.8) and (2.9) are satisfied, then (2.11) holds.

We say that the functional G satisfies the cyclic condition of order two if

D_x G_y D_y G_x = 0 P-a.s. for λ²-a.e. (x, y) ∈ X²;   (2.12)

see [18], where such conditions were used to simplify moment formulae for the KS-integral. Note that (2.12) always holds if the functional G is predictable, that is, the carrier space is equipped with a strict partial order ≺ and G_y(η) depends only on η restricted to {x ∈ X : x ≺ y}. If (2.12) holds, then also

In view of this, under the cyclic condition, the bounds from Theorem 2.1 simplify as follows.
Corollary 2.2. Assume that the cyclic condition (2.12) holds and that the assumptions of Theorem 2.1 are maintained. Then the bounds (2.10) and (2.11) hold with T_2 = T_8 = 0, and

(see e.g. [14, Chapter 12]). In this case, Theorem 2.1 yields the classical Stein bounds for the Wasserstein and the Kolmogorov distance. The results of [26] (with π = 1 there) provide bounds on the Wasserstein distance between a KS-integral and a standard normal random variable. In contrast to our main results, the bound in Theorem 3.1 from [26] still contains KS-integrals, as integration by parts is employed there only once. Proceeding with further integrations by parts might be challenging, since one of the KS-integrals appears within an absolute value. The bound on the Wasserstein distance presented in Theorem 3.1 is evaluated in Corollary 3.2, but the resulting bound might not always behave as desired for a limit theorem. The first term on its right-hand side can be bounded from below by

which does not become small if the KS-integral has variance one and the second term in (2.6) has a non-vanishing contribution (see Example 6.5 for such a situation). The third term contains only a product of two factors, which might not be sufficient if one rescales by the standard deviation of the KS-integral (see e.g. the situation discussed in Remarks 6.1 and 6.4 under the additional assumptions that u is constant and ϕ is translation invariant in its first argument).
Remark 2.5. In view of the works [16, 23], we expect that our results can be extended to the multivariate normal approximation of vectors of KS-integrals for distances based on smooth test functions and for the so-called d_convex-distance under suitable assumptions.

Preliminaries
In this section we provide some basic properties of the difference operator D and the KS-integral δ. First of all, we recall from [10] the definitions of D and δ as Malliavin operators. These definitions are based on n-th order Wiener–Itô integrals I_n, n ∈ ℕ; see also [14, Chapter 12]. For symmetric functions f ∈ L²(λⁿ) and g ∈ L²(λᵐ) with n, m ∈ ℕ we have

E[I_n(f) I_m(g)] = 1{n = m} n! ∫ f g dλⁿ.   (3.1)

We use the convention I_0(c) := c for c ∈ R. Every H ∈ L²(P_η) can be represented as

H = Σ_{n=0}^∞ I_n(h_n),   (3.2)

where we recall our (somewhat sloppy) convention H ≡ H(η), and where h_0 = EH and the h_n, n ∈ ℕ, are symmetric elements of L²(λⁿ). Here and in the following, we mean by series of Wiener–Itô integrals their L²-limit, whence all identities involving such sums hold almost surely. Then H is in the domain dom D of the difference operator D (in the sense of a Malliavin operator) if

Σ_{n=1}^∞ n · n! ‖h_n‖²_{L²(λⁿ)} < ∞.

In this case one has

D_x H = Σ_{n=1}^∞ n I_{n−1}(h_n(x, ·)) for λ-a.e. x ∈ X,

and the product rule

D_x(H H′) = H D_x H′ + H′ D_x H + (D_x H)(D_x H′)   (3.3)

holds for measurable H, H′ : N → R.
Now let G : N × X → R be a measurable function such that G_x ≡ G(·, x) ∈ L²(P_η) for λ-a.e. x. Then there exist measurable functions g_n : X^{n+1} → R, symmetric and square integrable with respect to λⁿ in the last n coordinates, such that

G_x = Σ_{n=0}^∞ I_n(g_n(x, ·))   (3.4)

for λ-a.e. x. One says that G is in the domain dom δ of the KS-integral δ if

Σ_{n=0}^∞ (n + 1)! ‖g̃_n‖²_{L²(λ^{n+1})} < ∞,

where g̃_n : X^{n+1} → R is the symmetrisation of g_n. In this case the KS-integral of G is defined by

δ(G) := Σ_{n=0}^∞ I_{n+1}(g̃_n).   (3.5)

Lemma 3.1. Suppose that G satisfies (2.2) and (2.3), and let H ∈ L²(P_η). Then

E[δ(G) H] = E ∫ G_x D_x H λ(dx).   (3.7)

Proof. The proof is essentially that of Lemma 2.3 in [22]. For the convenience of the reader we provide the main arguments. Since H ∈ L²(P_η), we can represent H as in (3.2). Similarly, we can write

where the measurable functions h′_n : X^{n+1} → R are symmetric and square integrable with respect to λⁿ in the last n coordinates. In fact, it follows from [14, Theorem 18.10] that we can choose

Combining this with (3.4) and (3.1), we obtain

for λ-a.e. x. The Cauchy–Schwarz inequality (applied twice) yields

Since EH² < ∞, the first factor on the above right-hand side is finite. By assumption (2.3), the second factor is finite as well; see the proof of [10, Theorem 5]. Hence (3.8) holds. The remainder of the proof is as in [22].
Lemma 3.2. Suppose that G satisfies (2.2) and (2.3), and let H : N → R be a measurable function satisfying

(3.9)

Then

E[δ(G) H] = E ∫ G_x D_x H λ(dx).   (3.10)

Proof. If H is bounded, then (3.10) follows from Lemma 3.1. In the general case we set H_r := max{min{H, r}, −r} for r > 0. Then (3.10) holds with H_r instead of H. Hence, the observation that

together with (3.9) allows us to conclude the assertion by dominated convergence.
Lemma 3.3. Suppose that G satisfies (2.2)–(2.4). Then δ(G) ∈ dom D and, a.s. and for λ-a.e. x ∈ X,

D_x δ(G) = G_x + δ(D_x G).   (3.11)

Proof. Representing G as in (3.4) and using [10, Theorem 3] twice, we can write

By the L²-convergence of the right-hand side and (3.1), we obtain

By assumption (2.4) this is finite, which is equivalent to

In view of (3.5) and the inequalities

(a consequence of Jensen's inequality), this yields that δ(G) ∈ dom D. Let G′ be another measurable function satisfying (2.2) and (2.3). It follows from (3.6) and the polarisation identity that

The integration by parts formula (3.7) yields that

Assumptions (2.3) and (2.4) show that D_x G ∈ dom δ for λ-almost all x and that δ(D_• G) belongs to L²(P ⊗ λ) (see (3.6) and the discussion before it). Therefore, we obtain from Fubini's theorem and integration by parts that

where we could apply Fubini's theorem on the left-hand side due to (2.3) and on the right-hand side by the Cauchy–Schwarz inequality and the square integrability of G′ and δ(D_• G). Inserting these two results into (3.12) yields

Since the class of functions G′ with the required properties is dense in L²(P_η ⊗ λ) (see e.g. the proof of [10, Theorem 5]), we conclude the asserted formula (3.11).
4 Proof for the Wasserstein distance in Theorem 2.1

Our proof is similar to the proofs of Theorems 1.1 and 1.2 in [13] and relies on ideas already present in [15]. The first step is to recall Stein's method. Let C^{1,2} be the set of all twice continuously differentiable functions g : R → R whose first derivative is bounded in absolute value by 1 and whose second derivative is bounded by 2. Then we have for an integrable random variable X that

d_W(X, N) ≤ sup_{g ∈ C^{1,2}} |E[g′(X) − X g(X)]|.

Let the function G satisfy the assumptions of Theorem 2.1 and write X := δ(G). By the definition of the KS-integral we can write X ≡ X(η) as a measurable function of η. Let g ∈ C^{1,2}. Then we have for λ-a.e. x ∈ X and a.s. that

D_x g(X) = g(X + D_x X) − g(X).

Since g is Lipschitz (by the boundedness of its first derivative) and X ∈ dom D by Lemma 3.3, it follows that |D_x g(X)| ≤ |D_x X|, so that Dg(X) (considered as a function on N × X) is square integrable with respect to P_η ⊗ λ. Since, moreover, it is clear that g(X) is square integrable, we have in particular that g(X) ∈ dom D. The integration by parts formula (3.7) yields that

E[X g(X)] = E ∫ G_x D_x g(X) λ(dx).

Since G ∈ L²(P_η ⊗ λ) and X ∈ dom D, we obtain from the Lipschitz continuity of g and the Cauchy–Schwarz inequality that

We have that

Our assumptions on G allow us to apply the commutation rule (3.11) to D_x X, yielding a.s. and for λ-a.e. x that

D_x X = G_x + δ(D_x G).

In view of |g′| ≤ 1, (3.11), (2.2) and (4.3), we can note that

We obtain

Since Eδ(G)² = 1, Jensen's inequality and (3.6) yield that

It follows from the Poincaré inequality (see [14, Section 18.3]) that

We now turn to U_1. We note first that, by |g′| ≤ 1 and (2.2),

Because of

for x ∈ X and s ∈ [0, 1], we have that

where we have used the commutation rule (3.11) in the last step. To justify the linearity of the integration we can assume without loss of generality that

and use that |g′′| ≤ 2. The latter inequality yields that |H(s, x)| ≤ 2s and

To treat the term (4.7) we first use

and the preceding integrability properties to conclude that

Therefore, we obtain from Fubini's theorem that

The expectation on the above right-hand side can be bounded with Lemma 3.2 applied to H := G_x² H(s, x) and with D_x G instead of G (justified by (2.3), (2.4) and (4.8)). This gives

where we used

Now we turn to the term U_2. Define R_x := ∫₀¹ g′(X + s D_x X) ds, x ∈ X. By the integrability property (4.4) and Fubini's theorem,

By Lemma 3.1, whose assumptions are satisfied for λ-a.e. x by (2.2)–(2.4) and |g′| ≤ 1, and the product rule (3.3),

Here, the expectations exist for λ²-a.e. (x, y) because of |g′| ≤ 1, (2.2) and (2.3). In view of the definition of T_4 we can assume without loss of generality that

The commutation rule (3.11) leads to

The following computation as well as (2.3) and (2.4) allow us to apply Lemma 3.2 to the second term on the right-hand side. From the commutation rule (3.11), the boundedness of g′ and g′′, (4.9) and (2.3) we obtain

Thus, we derive from Lemma 3.2 and the preceding computation that

where we used (3.3) in the last step. Similarly as in (4.1), we derive

By assumptions (2.2)–(2.5) we can use the commutation rule (3.11) twice to obtain that, a.s. and for λ²-a.e. (x, y),

while D_y X = G_y + δ(D_y G) a.s. and for λ-a.e. y. Therefore, (4.10) equals

Because of the assumption T_4 < ∞, this yields

Together with (2.2) and (2.3), we deduce from (4.11) that

for λ²-a.e. (x, y). Hence, we have shown that

By Lemma 3.2, which can be applied due to (4.12), the second term on the right-hand side can be further bounded by

Combining the previous bounds, we see that

which together with (4.5) completes the proof.
5 Proof for the Kolmogorov distance in Theorem 2.1

We prepare the proof of the second part of Theorem 2.1 with two lemmas. Since we consider iterated KS-integrals in the following, we indicate the integration variable by a subscript, i.e., we write δ_x to denote the KS-integral with respect to x.

Lemma 5.1. Let h : N × X² → R be measurable and such that

(5.1)

Then the following holds:

(i) The iterated KS-integral δ_x(δ_y(h(x, y))) is well defined.

(ii) Let H ∈ L²(P_η) be such that D²_{x,y} H ∈ L²(P_η) for λ²-a.e. (x, y) and

Then

E ∫ D²_{x,y} H h(x, y) λ²(d(x, y)) = E[δ_x(δ_y(h(x, y))) H].
Proof. First, let us assume that all KS-integrals are well defined. By applying iteratively [13, Corollary 2.4] and (3.11), we have

Since, by (5.1), the right-hand side is finite, all involved KS-integrals are well defined by [13, Proposition 2.3].
For λ-a.e. x our assumptions imply D_x H ∈ L²(P_η) and D²_{x,y} H ∈ L²(P_η) for λ-a.e. y, as well as

Thus, it follows from Lemma 3.1 that

Since H ∈ L²(P_η) and D_x H ∈ L²(P_η) for λ-a.e. x, combining (5.1) and [13, Corollary 2.4] as in the proof of part (i) yields

A further application of Lemma 3.1 leads to

which concludes the proof of part (ii).
For a ∈ R, let f_a be a solution of the Stein equation

f′(u) − u f(u) = 1{u ≤ a} − Φ(a), u ∈ R,   (5.3)

where Φ is the distribution function of the standard normal distribution. Note that f_a is continuously differentiable on R \ {a}. Thus, we use the convention that f_a′(a) is the left-sided limit of f_a′ at a. For the following lemma we refer the reader to [4, Lemma 2.2 and Lemma 2.3].

Lemma 5.2. For each a ∈ R there exists a unique bounded solution f_a of (5.3). This function satisfies: (i) u ↦ u f_a(u) is non-decreasing, and (ii) |u f_a(u) − 1{u ≤ a}| ≤ 2 for all u ∈ R.

Now we are ready for the proof for the Kolmogorov distance. It combines the approach for the Wasserstein distance with arguments from [8], which refined ideas previously used in [5] and [22]. Indeed, for the normal approximation of Poisson functionals in the Kolmogorov distance the Malliavin–Stein method was first used in [22]. One of the terms in the bound was removed in [5] and two more in [8]. The innovation of [8], which was inspired by the proof of Theorem 2.2 in [24] and which we also employ in the following, is to exploit the monotonicity of u ↦ u f_a(u) and u ↦ 1{u ≤ a}.
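As a numerical sanity check (ours; the closed form below is the standard bounded solution of the Stein equation, assumed here rather than taken from [4]), one can evaluate f_a(u) = √(2π) e^{u²/2} Φ(u ∧ a)(1 − Φ(u ∨ a)) on a grid and verify (5.3) as well as the bound in Lemma 5.2 (ii):

```python
import numpy as np
from scipy.stats import norm

def f_a(u, a):
    """Standard closed form of the bounded solution of the Stein equation (5.3)."""
    lo, hi = np.minimum(u, a), np.maximum(u, a)
    return np.sqrt(2 * np.pi) * np.exp(u**2 / 2) * norm.cdf(lo) * (1 - norm.cdf(hi))

a, h = 0.5, 1e-6
u = np.linspace(-3, 3, 601)
u = u[np.abs(u - a) > 1e-3]     # f_a is not differentiable at u = a
fprime = (f_a(u + h, a) - f_a(u - h, a)) / (2 * h)
residual = fprime - u * f_a(u, a) - ((u <= a).astype(float) - norm.cdf(a))
print(np.max(np.abs(residual)))                  # ~ 0: (5.3) holds
print(np.max(np.abs(u * f_a(u, a) - (u <= a))))  # <= 2, cf. Lemma 5.2 (ii)
```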
Proof for the Kolmogorov distance in Theorem 2.1. Throughout the proof we can assume without loss of generality that T_1, T_2, T_6, T_7, T_8, T_9 < ∞. Let a ∈ R, and let f_a be the solution of (5.3) from Lemma 5.2. For X := δ(G) we have f_a(X) ∈ dom D (since |f_a′| ≤ 1 and X ∈ dom D), whence the integration by parts rule (3.7) yields, similarly as in (4.2), that

Together with

we obtain

where the decomposition into I_1 and I_2 is allowed due to |f_a′| ≤ 1 and (4.3). The commutation rule (3.11) yields

From Fubini's theorem, which is applicable because of |f_a′| ≤ 1 and (4.3), and Lemma 3.1 it follows that

The use of Lemma 3.1 is justified by f_a′(X)G_x ∈ L²(P_η) for λ-a.e. x and by (2.2) and (2.3), as well as (2.3) and (2.4). From (3.3) we derive

Combining this with |f_a′| ≤ 1, (2.3) and (2.7), we see that

(5.4)

By Fubini's theorem, this makes it possible to rewrite I_1 as

It follows, as in the proof for the Wasserstein distance, that

As shown in (5.4), we can apply Fubini's theorem to I_{1,2}, so that

Here, (2.2) and (2.3) are satisfied because of T_6 < ∞. Thus, Lemma 3.1 shows that

Together with |f_a′| ≤ 1 and Jensen's inequality, we obtain that

In the sequel, we focus on I_2. By (5.3), the inner integral in

Since u ↦ u f_a(u) is non-decreasing (see Lemma 5.2 (i)) and u ↦ 1{u ≤ a} is non-increasing, we derive by considering the cases D_x X ≥ 0 and D_x X < 0 separately that

and

Combining these estimates with (3.11) leads to

The decomposition into two integrals on the right-hand side is allowed, as can be seen from the following argument. From Lemma 5.2 (ii) we know that

|u f_a(u) − 1{u ≤ a}| ≤ 2 for all u ∈ R.   (5.5)

Together with (2.2), we see that

It follows from (5.5), the Cauchy–Schwarz inequality, [13, Corollary 2.4] and (2.2)–(2.4) that

Thus, the integrals I_{2,1} and I_{2,2} are well defined and finite. Moreover, we can interchange expectation and integration in I_{2,1} and I_{2,2} by Fubini's theorem. We deduce from (5.5) for Z := X f_a(X) − 1{X ≤ a} that

|Z| ≤ 2, |D_x Z| ≤ 4 for λ-a.e. x and |D²_{x,y} Z| ≤ 8 for λ²-a.e. (x, y).   (5.6)

Note that ∫ E G_x⁴ λ(dx) < ∞ since T_7 < ∞. Together with (2.8), we see that X ∋ x ↦ G_x|G_x| satisfies the integrability conditions (2.2) and (2.3) and that G|G| ∈ dom δ. Thus, Lemma 3.1 with G replaced by G|G| implies

The decomposition of I_{2,2} into two integrals is justified since it follows from (5.5), (2.3) and (2.7) that

and

(5.8)

Note that δ_x(δ_y(h(x, y))) is well defined by Lemma 5.1 (i). Together with (5.6) and (5.7) it follows from Lemma 5.1 (ii) that

Because of T_8 < ∞ we see that

and recall (2.9), whence X ∋ x ↦ ∫ D_x G_y D_y|G_x| λ(dy) satisfies the integrability assumptions (2.2) and (2.3) and belongs to dom δ. By (5.6), (5.8), Fubini's theorem and Lemma 3.1,

We have shown that

Now (5.5) and Jensen's inequality yield that

and that

By (3.6), we have

and

From Lemma 5.1 (i), whose assumptions are satisfied due to T_9 < ∞, it follows that

which completes the proof.

Poisson embedding
In this section we consider a Poisson process η on X := R^d × R_+, whose intensity measure λ is the product of the Lebesgue measure λ_d on R^d and the Lebesgue measure λ_+ on R_+. We fix a measurable mapping ϕ : R^d × N → [0, ∞], where the value ∞ is allowed for technical convenience. Then

ξ := ∫ 1{s ∈ ·} 1{x ≤ ϕ(s, η)} η(d(s, x))   (6.1)

is a point process on R^d. (At this stage it might not be locally finite.) Let u : R^d → R be a measurable function, and define G : N × X → R by G_{(s,x)}(µ) := u(s) 1{x ≤ ϕ(s, µ)}. Under suitable integrability assumptions we then have

δ(G) = Σ_{(s,x) ∈ η} u(s) 1{x ≤ ϕ(s, η − δ_{(s,x)})} − ∫ u(s) ϕ(s, η) ds.

This can be interpreted as an integral of u with respect to the compensated point process ξ.
To make the dependence on u more visible, we abuse our notation and write δ(u) := δ(G), whenever this integral is defined pathwise.
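A small simulation sketch (ours; the kernel ϕ, the constants and the window are illustrative choices) of the Poisson embedding for d = 1: the toy kernel ϕ(s, µ) = a + b·µ([s − 1, s) × [0, 1]) does not depend on a point at s itself, so removing (s, x) from η does not change ϕ(s, ·), and the pathwise formula for δ(u_B) with u ≡ 1 can be evaluated exactly:

```python
import numpy as np

rng = np.random.default_rng(4)
L, M, a, b = 30.0, 20.0, 1.0, 0.5   # eta lives on [0, L] x [0, M]

def phi(s, pts):
    """phi(s, mu) = a + b * mu([s-1, s) x [0, 1]), a predictable toy kernel."""
    return a + b * np.sum((pts[:, 0] >= s - 1) & (pts[:, 0] < s) & (pts[:, 1] <= 1.0))

def delta_u(pts, lo, hi):
    # points of xi in B = [lo, hi]: (s, x) in eta with x <= phi(s, eta - delta_(s,x));
    # the window [s-1, s) excludes s, so phi is unchanged by removing (s, x)
    in_B = (pts[:, 0] >= lo) & (pts[:, 0] <= hi)
    kept = sum(1 for s, x in pts[in_B] if x <= phi(s, pts))
    # compensator: integral of phi(s, eta) over B; s -> phi(s, eta) is
    # piecewise constant, so the integral can be computed exactly
    comp = a * (hi - lo)
    for p, x in pts:
        if x <= 1.0:
            comp += b * max(0.0, min(p + 1.0, hi) - max(p, lo))
    return kept - comp

vals = []
for _ in range(300):
    n = rng.poisson(L * M)   # unit-rate eta on the window (phi stays well below M)
    pts = np.column_stack([rng.uniform(0, L, n), rng.uniform(0, M, n)])
    vals.append(delta_u(pts, 1.0, L))
print(np.mean(vals))  # close to 0: delta(u_B) is a centred KS-integral
```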
Under certain assumptions, it can be expected that the standardised δ(u) is close to a normal distribution. To establish an asymptotic scenario, we take a Borel set B ⊂ R^d with λ_d(B) < ∞ and let u_B := 1_B u. We are interested in the normal approximation of δ(u_B) for B of growing volume.

Remark 6.1. Assume that d = 1 and that ϕ is predictable, that is, ϕ(t, µ) = ϕ(t, µ_{t−}), where µ_{t−} is the restriction of µ ∈ N to (−∞, t) × R_+. Then, under suitable integrability assumptions (satisfied under our assumptions below),

(ξ([0, t]) − ∫₀ᵗ ϕ(s, η) ds)_{t≥0}

is a martingale with respect to the filtration (σ(η_{(−∞,t]×R_+}))_{t≥0}; see e.g. [11]. Therefore, (ϕ(t, ·))_{t≥0} is a stochastic intensity of ξ (on R_+) with respect to this filtration. Take B = [0, T] for some T > 0 and write u_T := u_B. Then (δ(u_T))_{T≥0} is a martingale. Theorem 3.1 from [25] provides a quantitative central limit theorem in the Wasserstein distance for δ(u_T). Below we derive a similar result using our tools, not only for the Wasserstein but also for the Kolmogorov distance. It should be noted that predictability and martingale properties are of no relevance for our approach. All that matters is that δ(u_B) is a KS-integral with respect to the Poisson process η.
Before stating some assumptions on ϕ, we introduce some useful terminology. A mapping Z from N to the Borel sets of X is called graph-measurable if (µ, s, x) ↦ 1{(s, x) ∈ Z(µ)} is a measurable mapping. Given such a mapping, we define a whole family Z_t, t ∈ R^d, of such mappings by setting

where θ_t µ := ∫ 1{(r − t, z) ∈ ·} µ(d(r, z)) is the shift of µ by t in the first coordinate, and

We assume that there exists a graph-measurable Z such that

Here, we denote by ν_A the restriction of a measure ν to a Borel set A of X. Next, we assume that there exists a measurable mapping Y :

∫ E[λ(Z_0 ∩ Z_s)⁴]^{1/4} ds < ∞,   (6.4)

It follows from Fubini's theorem, Hölder's inequality and (6.5) that Eλ(Z_0)⁴ < ∞. Assumptions (6.3) and (6.7) justify that δ(u_B) is defined pathwise if u is bounded. Moreover, we will see below that our assumptions imply that (2.2) and (2.3) hold. Therefore, G_B is in the domain of the KS-integral.
For the normal approximation of δ(u_B) we have the following result.

Theorem 6.3. Let ϕ : R^d × N → [0, ∞] be measurable, and let Z be a graph-measurable mapping from N to the Borel sets of X. Assume that (6.2)–(6.7) are satisfied. Let u : R^d → R be measurable and bounded, and let B ⊂ R^d be a Borel set with λ_d(B) < ∞. Finally, assume that σ_B² := Var(δ(u_B)) > 0. Then there exists a constant c > 0, not depending on B, such that the bound (6.9) holds.

Proof. We apply Theorem 2.1 with G_B/σ_B in place of G. For notational simplicity we omit the subscript B of G_B. We need to bound the terms T_i for i ∈ {1, …, 9}. The assumptions of Theorem 2.1 are checked at the end of the proof. For simplicity, assume that |u| is bounded by 1. The value of a constant c might change from line to line. We often write D_{s,x} instead of D_{(s,x)}. The term

where the second inequality follows from assumptions (6.3) and (6.7). Here and later we often use that θ_s η and η have the same distribution for each s ∈ R^d, whence Y_s has the same distribution for all s ∈ R^d, and the same holds for λ(Z_s). We deduce from (6.2) that, for (s, x) ∈ X, (t, y) ∉ Z_s and ν ∈ N with ν(X) ≤ 2,

whence the first three difference operators of 1{x ≤ ϕ_s} vanish if one of the additional points is outside of Z_s. From (6.3) we see that 1{x ≤ ϕ_s} and its first three difference operators become zero if x > Y_s. In the following, these observations are frequently used to bound difference operators in terms of indicator functions.
First we consider T_1. Writing the square of the inner integral as a double integral, we have

By the discussed behaviour of the difference operators,

where we have used Hölder's inequality. By (6.7), EY_s³ = EY_r³ = EY_0³ < ∞. Moreover,

Therefore,

where we have used assumption (6.4) (and the monotonicity of L^p-norms). Hence, T_1 is bounded as required by (6.9).

For the term T_2, we have

The inner integrand only contributes if (s, x) ∈ Z_t, (t, y) ∈ Z_s, and (r, z) ∈ Z_t or (r, z) ∈ Z_s. Since the last two cases are symmetric, T_2′ can be bounded by

By Fubini's theorem,

By the definition of Z_t and Z_s and the distributional invariance of η,

Changing variables yields that

where

Since

we obtain from assumption (6.6) that b < ∞. Hence,

where we have used assumption (6.4). Each of the summands in the term T_4′ := σ_B³ T_4 includes the factor D_{s,x} 1{y ≤ ϕ_t}, so that

For T_5′ := σ_B³ T_5, we have

where in the second term of T_5 we renamed x as y and vice versa. This leads to the upper bound

We can rewrite T_6′ := σ_B⁴ T_6² as the sum of T_{6,1}′ and T_{6,2}′ with

and

For T_7′ := σ_B⁴ T_7², the first term can be bounded as T_3′, while the second term is bounded by

Here, T_{9,i}′ is an integral with respect to i points for i ∈ {1, 2, 3}. The term T_{9,1}′ can be bounded by (6.10), while

For T_{9,3}′ we deduce the bound

which can be treated similarly as in the computation for T_{9,2}′, but with the power 4. Finally, we check the assumptions of Theorem 2.1. The expression in (2.2) can be treated as T_3′, while (2.3), (2.7) and (2.8) can be bounded as T_4′. Similarly, we can verify (2.4), (2.5) and (2.9) by using the computations for T_{9,2}′, T_{9,3}′ and T_{6,2}′, respectively.

Remark 6.4. Theorem 6.3 can be used to establish central limit theorems. Consider, for instance, the setting of Remark 6.1. Two possible choices of Z_t are provided in Example 6.2. Since ϕ is assumed to be predictable in Remark 6.1, the cyclic condition (2.12) is satisfied and (2.6) simplifies to

It is natural to assume that σ_T² ≥ cT for some c > 0 and all sufficiently large T. If, additionally, the assumptions of Theorem 6.3 are satisfied, then (6.9) shows that

for some c′ > 0 and all sufficiently large T. It does not seem to be possible to derive the Wasserstein part of this bound from [25, Theorem 3.1]; see also [6, Remark 3.8].
The reason is that the third term on the right-hand side of [25, (3.9)] does not have the appropriate order.
Example 6.5. Let h : R^d → R_+ be a measurable function satisfying ∫ (h(s) + h(s)²) ds < ∞. Define Z := {(s, x) ∈ R^d × R_+ : x ≤ h(s)} and Z_t := Z + t, t ∈ R^d. We interpret Z and Z_t as constant mappings on N and check that (6.4)–(6.6) are satisfied. For (6.4) we note that

Since h is square integrable, we have ∫ 1{y ≤ h(s)} ds ≤ c y^{−2} for some c > 0, so that the above integral is finite. Relation (6.5) follows at once from the integrability of h, while the left-hand side of (6.6) is bounded by ∫ h(s)² ds. Assume now that the function ϕ satisfies

Then (6.2) holds. Assumptions (6.3) and (6.7) depend on the choice of ϕ. They are satisfied, for instance, if ϕ(t, µ) is a polynomial or exponential function of µ(Z_t).
Assume that u and Eϕ(·, η_Z) have a lower bound c > 0 and that ϕ(s, ·) is, for all s ∈ R^d, either increasing or decreasing when adding a point. Then Theorem 6.3 yields a (quantitative) central limit theorem for λ_d(B) → ∞. To this end, we need to find a lower bound for σ_B², given by (2.6). In our case the first term on the right-hand side of (2.6) equals

and has the lower bound

The second term is given by

E ∫ 1{s, t ∈ B} u(s) u(t) D_{t,y} 1{x ≤ ϕ(s, η)} D_{s,x} 1{y ≤ ϕ(t, η)} d(s, x, t, y).
By the monotonicity assumption on ϕ and u ≥ c, this is non-negative.
Example 6.6. For a point configuration µ ∈ N and w ∈ X, the Voronoi cell of w is given by

V(w, µ) := {v ∈ X : ‖w − v‖ ≤ ‖w′ − v‖ for all w′ ∈ µ},

i.e., V(w, µ) is the set of all points in X to which no point of µ is closer than w. The cells (V(w, µ))_{w∈µ} have disjoint interiors and form a tessellation of X, the so-called Voronoi tessellation, an often studied model from stochastic geometry (see e.g. [21, Section 10.2]). From the Poisson–Voronoi tessellation (i.e., the Voronoi tessellation with respect to η) we construct the point process

This point process has the following geometric interpretation. We take all cells of the Poisson–Voronoi tessellation that intersect R^d × {0}, which one can think of as the lowest layer of the Poisson–Voronoi tessellation; the first coordinates of their nuclei are the points of ξ. The points of ξ form the projection of a one-sided version of the Markov path considered in [1].
First we check that ξ can be represented as in (6.1). For s ∈ R^d, x_1, x_2 ∈ R_+ with x_1 < x_2 and µ ∈ N we have

(6.12)

If V((s, 0), µ) is bounded, which is the case for P_η-a.e. µ, there exists a unique value x such that V((s, x), µ) ∩ (R^d × {0}) is exactly a single point. This allows us to rewrite ξ as

with R(s, µ) the maximal distance from (s, 0) to a point of its Voronoi cell. Note that V((s, 0), µ) is completely determined by the points of µ in B((s, 0), 2R(s, µ)), the closed ball in X with radius 2R(s, µ) around (s, 0). Indeed, the centres of all cells neighbouring the Voronoi cell of (s, 0) lie within this ball, and all other points of η outside it are too far away to affect the cell. If we consider V((s, x), µ) ∩ (R^d × {0}) as a function of x, then for increasing x the sets V((s, x), µ) ∩ (R^d × {0}) are non-increasing (see (6.12)), and V((s, 0), µ) is also completely determined by the points in B((s, 0), 2R(s, µ)). Hence, we can conclude that ϕ(s, µ) = ϕ(s, µ_{B((s,0),2R(s,µ))}).
For (6.6) we can bound P((s, x) ∈ B((0, 0), 2R(0, η)), (0, y) ∈ B((s, 0), 2R(s, η))) by the Cauchy–Schwarz inequality and then bound the resulting integral. This yields that the conclusions of Theorem 6.3 hold for the point process ξ from (6.11). Since ϕ is non-increasing with respect to additional points, one can argue as in the previous example to see that the variance admits a lower bound of order λ_d(B) if u > c_0 for some c_0 > 0. This yields a (quantitative) central limit theorem as λ_d(B) → ∞.
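The lowest layer of the Poisson–Voronoi tessellation can be approximated numerically (our sketch for d = 1, with scipy assumed available): a cell intersects R^d × {0} exactly when its nucleus is the nearest point of η to some (v, 0), which we test on a fine grid of v values, so very thin cells can be missed:

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(3)
L, height = 50.0, 10.0   # simulation window [0, L] x [0, height] for eta (d = 1)

# unit-rate Poisson process eta on the window
n = rng.poisson(L * height)
eta = np.column_stack([rng.uniform(0, L, n), rng.uniform(0, height, n)])

# nuclei of the cells hitting the line R x {0}: nearest eta-point of some (v, 0)
grid = np.column_stack([np.linspace(0.0, L, 20000), np.zeros(20000)])
_, idx = cKDTree(eta).query(grid)
xi = np.sort(eta[np.unique(idx), 0])   # first coordinates of these nuclei
print(len(xi), "points of xi in [0, L]")
```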

Functionals generated by a partial order
In this section we return to the setting of a general σ-finite measure space (X, 𝒳, λ). In many situations, the functional G_x can be written as G_x(µ) = f(x) H_x(µ), where f ∈ L²(λ) and the functional H_x(µ) is measurable in both arguments, takes values in {0, 1} and can be decomposed as

H_x(µ) = Π_{y ∈ µ} H_x(δ_y).   (7.1)

Write shortly H_x(y) instead of H_x(δ_y), and denote H̄_x(y) := 1 − H_x(y). A generic way to construct such functionals is to consider a strict partial order ≺ on X and to set H_x(y) := 1 − 1{y ≺ x}. The set of points x ∈ η such that H_x(η) = 1 is called the set of Pareto optimal points with respect to the chosen partial order, i.e., x ∈ η is Pareto optimal if there exists no y ∈ η such that y ≺ x. For x ∉ η, we have H_x(η) = 1 if x is Pareto optimal in η + δ_x. If δ(G) can be defined pathwise as in (1.1), then it equals the sum of the values of f over the Pareto optimal points, centred by the integral of f over the set of all x such that H_x(η) = 1. As shown in [12], such examples naturally arise in statistical applications.
It is easy to see by induction that

In particular,

By construction, H_y(η) = 1 and H_y(x) = 1 yield that H_x(η) = 1, which can be expressed as

The asymmetry property of the strict partial order implies that H̄_x(y) H̄_y(x) = 0 for all x, y ∈ X. Hence, the functional G satisfies the cyclic condition (2.12). Thus, the second term on the right-hand side of (2.6) vanishes. If (2.2) and (2.3) are satisfied, it follows from [13, Proposition 2.3] that the KS-integral δ(G) of G is well defined and

In addition, property (7.1) leads to a considerable simplification of the terms arising in the bounds in Corollary 2.2. Write H_x as a shorthand for H_x(η), and denote

Proposition 7.1. Assume that G_x(µ) = f(x) H_x(µ), where f ∈ L²(λ) and the functional H is determined by (7.1) from a strict partial order on X. Then the terms T_2 and T_8 defined before Theorem 2.1 vanish, and the other terms satisfy

Suppose that Var δ(G) > 0 and that (2.2)–(2.5) are satisfied. Then

Var δ(G)³.
Proof. The expression for T_1 follows from G_x² = f(x) G_x for x ∈ X and (7.3), while T_3 results from the definition of G_x. Now consider the further terms appearing in Corollary 2.2. We rely on (7.2) with m = 2, 3, on (7.3), and on (7.5) in the subsequent calculations. First,

where we used the fact that H̄_z(y) H̄_y(z) = 0 for all y and z, as well as (7.4). This yields the sought bound for T_5, taking into account that H̄_z(y) H̄_y(x) ≤ H̄_z(y) H̄_z(x). Next,

T_6 = (T_{6,1} + T_{6,2}) = ∫ f(y)² E[H_y H_y(z)] h_1(y)² λ²(d(y, z)).
Example 7.2. Let X be the unit cube [0, 1]^d with the Lebesgue measure λ. For x, y ∈ X, write y ≺ x if x ≠ y and all components of y are not greater than the corresponding components of x. Let G_x(µ) = H_x(µ), with H_x(µ) given by (7.1) and H̄_x(y) := 1{y ≺ x}. Let η_t be the Poisson process on X of intensity tλ. Then G_x(η_t) = 1 means that none of the points y ∈ η_t satisfies y ≺ x, that is, none of the points of η_t is smaller than x in the coordinatewise order. In this case, x is said to be a Pareto optimal point in η_t + δ_x. Then δ(G) equals the difference between the number of Pareto optimal points in η_t and the volume of the complement of the set of points x ∈ X such that y ≺ x for at least one y ∈ η_t.
It is shown in [2] that the right-hand side is of order log^{d−1} t for large t. Note that the above formula also gives the expected number of Pareto optimal points. Quantitative limit theorems for the number of Pareto optimal points, centred by subtracting the mean and scaled by the standard deviation, were obtained in [3]. Below we derive a variant of such a result for the KS-integral, which involves a different stochastic centring.
Since G_x(η) = f(x) H_x(η) with the function f identically equal to one and the measure λ is finite, the integrability conditions (2.2)–(2.5) and (2.7)–(2.9) are satisfied. The terms arising in Proposition 7.1 can be calculated as follows. First,

Here and in what follows we use the inequality s e^{−s} ≤ 1 with s = t|y|, which yields that

∫ t^i |y|^{i−1} e^{−t|y|} λ(dy) ≤ t ∫ (t|y| e^{−t|y|/i})^{i−1} e^{−t|y|/i} λ(dy) ≤ i^i σ²_{t/i}, i ∈ ℕ.
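To close, a simulation sketch (ours; t, d and the Monte Carlo sizes are illustrative choices) of Example 7.2 for d = 2: it computes the number of Pareto optimal points of η_t together with the random centring, namely t times the volume of the set of points not dominated by η_t, and confirms the zero mean Eδ(G) = 0:

```python
import numpy as np

rng = np.random.default_rng(5)
t, d = 200.0, 2   # Poisson process eta_t on [0,1]^2 with intensity t * Lebesgue

def pareto_count(pts):
    """Number of Pareto optimal points: no other point is coordinatewise smaller."""
    return sum(not np.any(np.all(np.delete(pts, i, axis=0) <= pts[i], axis=1))
               for i in range(len(pts)))

def delta_G(pts, n_mc=2000):
    # random centring: t * volume of {x : H_x(eta_t) = 1}, i.e. of the set of
    # points not dominated by eta_t, estimated by Monte Carlo
    xs = rng.uniform(size=(n_mc, d))
    h = np.array([not np.any(np.all(pts <= x, axis=1)) for x in xs])
    return pareto_count(pts) - t * h.mean()

vals = [delta_G(rng.uniform(size=(rng.poisson(t), d))) for _ in range(100)]
print(np.mean(vals))  # close to 0, since E delta(G) = 0
```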