Abstract
We study the distribution of a general class of asymptotically linear statistics which are symmetric functions of N independent observations. The distribution functions of these statistics are approximated by an Edgeworth expansion with a remainder of order \(o(N^{-1})\). The Edgeworth expansion is based on Hoeffding’s decomposition, which provides a stochastic expansion into a linear part, a quadratic part and smaller higher-order parts. The validity of this Edgeworth expansion is proved under Cramér’s condition on the linear part, moment assumptions for all parts of the statistic and an optimal dimensionality requirement for the non-linear part.
1 Introduction and results
1.1 Introduction
Let \(X,\, X_1,X_2,\dots , X_N\) be independent and identically distributed random variables taking values in a measurable space \((\mathcal X,\mathcal B)\). Let \(P_X\) denote the distribution of X on \((\mathcal X,\mathcal B)\). We assume that \(\mathbb T(X_1,\dots , X_N)\) is a symmetric function of its arguments (a symmetric statistic, for short). Furthermore, we assume that the moments \(\mathbf{E}\mathbb T\) and \(\sigma _{\mathbb T}^2:=\mathbf{Var}\mathbb T\) are finite. A function of the observations \(X_1,\dots , X_N\) is called a linear statistic if it can be represented as a sum of functions each depending on a single observation only. Many important statistics are non-linear but can be approximated by a linear statistic; we call these statistics asymptotically linear. The central limit theorem and the normal approximation with rate \(O(N^{-1/2})\) extend to the class of asymptotically linear statistics as well. Our approach to studying the distribution of this class of statistics in the statistically relevant case of asymptotically normal \(\mathbb T\) is based on Hoeffding’s decomposition of \(\mathbb T\), see Hoeffding [31], Efron and Stein [21] and van Zwet [37]. Hoeffding’s decomposition expands \(\mathbb T\) into a series of centered and mutually uncorrelated U-statistics of increasing order
Let L, Q and K denote the first, the second and the third sum. We call L the linear part, Q the quadratic part and K the cubic part of the decomposition. We shall consider a general situation where the kernel \(\mathbb T=\mathbb T^{(N)}\), the space \((\mathcal X,\mathcal B)=(\mathcal X^{(N)},\mathcal B^{(N)})\) and the distribution \(P_X=P_X^{(N)}\) all depend on N as \(N\rightarrow \infty \). In order to keep the notation simple we drop the subscript N in what follows. An improvement over the normal approximation is obtained by using Edgeworth expansions for the distribution function \(\mathbb F(x)=\mathbf{P}\{\mathbb T-\mathbf{E}\mathbb T\le \sigma _{\mathbb T}x\}\). For this purpose we write Hoeffding’s decomposition in the form
where R denotes the remainder. For a number of important examples of asymptotically linear statistics we have \(R/\sigma _{\mathbb T}=o_P(N^{-1})\) (in probability) as \(N\rightarrow \infty \). Therefore, the U-statistic \(\sigma _{\mathbb T}^{-1}(L+Q+K)\) can be viewed as a stochastic expansion of \((\mathbb T-\mathbf{E}\mathbb T)/\sigma _{\mathbb T}\) up to the order \(o_P(N^{-1})\).
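For a kernel of degree two the first Hoeffding projections can be computed explicitly as conditional expectations. The following small numerical sketch (with an illustrative kernel and distribution of our own choosing, not taken from the paper) checks the defining orthogonality of the quadratic kernel \(\psi\):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative kernel (our choice): h(x, y) = x*y + x**2 * y**2,
# with X uniform on (-1/2, 1/2).  Then E h = 1/144 and the first
# Hoeffding projection is g(x) = E h(x, X) - E h = x**2/12 - 1/144.
def h(x, y):
    return x * y + x**2 * y**2

Eh = 1.0 / 144.0
def g(x):                       # kernel of the linear part
    return x**2 / 12.0 - Eh
def psi(x, y):                  # kernel of the quadratic part
    return h(x, y) - Eh - g(x) - g(y)

# Monte Carlo check of the defining property E[psi(x, X)] = 0 for fixed x:
# the quadratic part is uncorrelated with any function of one observation.
X = rng.uniform(-0.5, 0.5, size=1_000_000)
for x0 in (-0.4, 0.0, 0.3):
    assert abs(psi(x0, X).mean()) < 2e-3
```

The same centering scheme, iterated over subsets of observations, produces the higher-order kernels of the decomposition.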
Furthermore, a so-called Edgeworth expansion of \(\sigma _{\mathbb T}^{-1}(L+Q+K)\) can be used to approximate \(\mathbb F(x)\) by a smooth distribution function G(x) as defined in (2) below depending on N and moments of \(\mathbb T\). A two term Edgeworth expansion of the distribution function of \(\sigma _{\mathbb T}^{-1}(L+Q+K)\) is given by
Here \(\Phi \) respectively \(\Phi '\) denote the standard normal distribution function and its derivative. Furthermore, we introduce \(\sigma ^2=\mathbf{E}g^2(X_1)\) and
Our main result, Theorem 1 below, establishes a bound \(o(N^{-1})\) for the Kolmogorov distance between \(\mathbb F(x)\) and G(x):
Valid expansions of this type were established by Cramér [19] for sums of independent random variables \(X_j\) and later for the Student statistic (which is of type (1)) by Kai-Lai Chung [18]. A new impetus for studying higher order approximations in statistics was given by the fundamental paper of Hodges and Lehmann on deficiency [30], where they compared the power of two tests based on N and \(N'\) observations respectively, with \(N'-N=o(N)\) as \(N\rightarrow \infty \). They suggested a program of comparisons of the power of tests, estimators and confidence regions based on classical parametric and non-parametric symmetric statistics, e.g. those using ranks and ordered samples. They noted that this would require going beyond Gaussian limit theorems to asymptotic expansions to order \(N^{-1}\). For more details on the statistical relevance and the related development of asymptotic methods we refer to the review paper in memory of Willem van Zwet [12].
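In the classical case of standardized i.i.d. sums the two-term Edgeworth expansion is fully explicit, and its gain over the normal approximation is easy to observe numerically. The sketch below uses the standard expansion for sums (with skewness and excess kurtosis denoted lam3, lam4; our notation), not the statistic-specific G of (2):

```python
import numpy as np
from math import erf, sqrt, pi

rng = np.random.default_rng(1)

def Phi(x):            # standard normal distribution function
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))
def phi(x):            # its derivative
    return np.exp(-x * x / 2.0) / sqrt(2.0 * pi)

# Classical two-term Edgeworth expansion for a standardized i.i.d. sum
# with skewness lam3 and excess kurtosis lam4 (see e.g. Petrov [33]).
def edgeworth(x, N, lam3, lam4):
    Phi_x = np.vectorize(Phi)(x)
    corr = (lam3 * (x**2 - 1) / (6 * sqrt(N))
            + (lam4 * (x**3 - 3 * x) / 24
               + lam3**2 * (x**5 - 10 * x**3 + 15 * x) / 72) / N)
    return Phi_x - phi(x) * corr

# Standardized sums of N exponential(1) variables: lam3 = 2, lam4 = 6.
N, M = 20, 500_000
S = rng.exponential(1.0, size=(M, N)).sum(axis=1)
Z = np.sort((S - N) / sqrt(N))

xs = np.linspace(-4, 4, 161)
F_emp = np.searchsorted(Z, xs, side='right') / M
err_norm = np.max(np.abs(F_emp - np.vectorize(Phi)(xs)))
err_edge = np.max(np.abs(F_emp - edgeworth(xs, N, 2.0, 6.0)))
assert err_edge < 0.5 * err_norm   # the expansion clearly beats the normal fit
```

The Kolmogorov distance to the normal law here is of order \(N^{-1/2}\), while the two-term expansion reduces it to \(o(N^{-1})\), in line with the bound of Theorem 1.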
Now we discuss the principal contribution of this paper: the minimal smoothness and structural conditions under which approximation (3) holds. Let us emphasize that any \(\mathbb F\) satisfying (3) cannot have fluctuations/increments of order \(\Theta (N^{-1})\) on intervals of size \(o(N^{-1})\), because G is a differentiable function with all derivatives bounded. We focus on the conditions that guarantee the necessary level of smoothness of the distribution of \(\mathbb T\). In the case of a linear statistic \(\mathbb T=\mathbf{E}\mathbb T+L\) the necessary smoothness of \(\mathbb F\) is ensured by the classical Cramér condition
This condition excludes, in particular, lattice distributions, for which approximation (3) obviously fails. We note that condition (C) can be weakened to cover some special classes of discrete distributions which are sufficiently non-lattice, see e.g. Bickel and Robinson [13], Angst and Poly [1] or Bobkov [14] for almost sure choices of such non-lattice discrete distributions.
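Condition (C) is easy to probe numerically: a continuous distribution keeps \(|\mathbf{E}\exp\{itX\}|\) bounded away from 1 over large frequency ranges, while a lattice distribution does not. A minimal check with two illustrative distributions of our own choosing:

```python
import numpy as np

# Numerical look at Cramér's condition (C): limsup_{|t| -> oo} |E exp(itX)| < 1.
t = np.linspace(1.0, 50.0, 200_001)

# X uniform on (-1/2, 1/2): characteristic function sin(t/2)/(t/2),
# whose modulus stays bounded away from 1 for |t| >= 1 -- (C) holds.
cf_uniform = np.abs(np.sin(t / 2) / (t / 2))

# X Rademacher (+-1 with probability 1/2): characteristic function cos(t),
# which returns to modulus 1 at every multiple of pi -- (C) fails.
cf_lattice = np.abs(np.cos(t))

assert cf_uniform.max() < 0.97   # bounded away from 1
assert cf_lattice.max() > 0.999  # keeps coming back to 1
```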
Since the class of symmetric statistics should include linear statistics, we require a Cramér type condition, but on the linear part of the statistic only, see (7). Interestingly, this condition together with appropriate moment conditions on the various parts of the decomposition (1) already guarantees an approximation error \(\Delta =O(N^{-1})\) for general symmetric statistics (see [4]). But (7) is not sufficient for the desired error bound \(o(N^{-1})\), even for U-statistics of degree two, see Example 1 below. The reason why (7) alone is not sufficient for the approximation accuracy \(\Delta =o(N^{-1})\) is the potential occurrence of a very special relation between the linear and quadratic parts L and Q that fosters an approximate lattice structure, as shown in Example 1. Namely, the quadratic part of the U-statistic in Example 1 has a factorizable kernel \(\psi \) of the form \(\psi _h(X_1,X_2)=h(X_1)g(X_2)+g(X_1)h(X_2)\) with measurable h. The following structural condition (4) (introduced in the unpublished manuscript by Götze and van Zwet [25]) rules out such counterexamples by separating (in \(L^2\) distance) the random variable \(\psi (X_1,X_2)\) from any random variable of the form \(\psi _h(X_1,X_2)\). Note that the \(L^2\) distance \(\mathbf{E}(\psi (X_1,X_2)-\psi _h(X_1,X_2))^2\) is minimized by \(h(x)=b(x)\), where
Here \(\kappa =\mathbf{E}\psi (X_1,X_2)g(X_1)g(X_2)\). Therefore, we will assume that, for some absolute constant \(\delta _{*}>0\), we have
The main contribution of the present paper consists of a proof that condition (4) does indeed ensure the desired bound \(\Delta =o(N^{-1})\). The proof is based on a careful investigation of the size distribution, for \(|t|> N^{1-\nu }\), of the absolute values of conditional Fourier transforms of symmetric statistics, that is, of the landscape of its maxima under Cramér’s condition (7) and the structural condition (4). New methods are used for studying this landscape in the frequency t as well as in the random function representing the conditioning. For the latter variable a combinatorial argument of Kleitman on symmetric partitions for the Littlewood–Offord problem in Banach spaces (see [15]) is used.
A short outline of the approach is given at the beginning of Sect. 2, where we focus on the use of condition (4).
1.2 Results
Let us state our main result, Theorem 1.
Moment conditions We will assume that, for some absolute constants \(0<A_*<1\) and \(M_*>0\) and numbers \(r>4\) and \(s>2\), we have
These moment conditions refer to the linear, quadratic and cubic part of \(\mathbb T\). In order to control the remainder R of the approximation (1) we use moments of differences introduced in Bentkus, Götze and van Zwet [4], see also van Zwet [37]. Define, for \(1\le i\le N\),
A subsequent application of difference operations \(D_i\), \(D_j\), \(\dots \) (the indices i, j, \(\dots \) are all distinct) produces higher order differences, like
For \(m=1,2,3,4\) write \(\Delta _m^2=\mathbf{E}|N^{m-1/2}D_1D_2\cdots D_m \mathbb T|^2\).
We will assume that for some absolute constant \(D_*>0\) and number \(\nu _1\in (0,1/2)\) we have
For a number of important examples of asymptotically linear statistics the moments \(\Delta _m^2\) are evaluated or estimated in [4]. Typically we have \(\Delta _m^2/\sigma _\mathbb T^2=O(1)\) for some m. Therefore, assuming that (6) holds uniformly in N as \(N\rightarrow \infty \), we obtain from the inequality \(\mathbf{E}R^2\le N^{-3}\Delta _4^2\), see (167) (see “Appendix”), that \(R/\sigma _\mathbb T=O_P(N^{-1-\nu _1})\). Furthermore, assuming that (5), (6) hold uniformly in N as \(N\rightarrow \infty \), we obtain from (167), (166), see “Appendix”, that \(\sigma ^2/\sigma _\mathbb T^2=(1-O(N^{-1}))\).
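The key effect of iterated differences is that an m-fold difference annihilates all parts of the decomposition of order below m, so \(\Delta_m\) measures only the contribution of order m and higher. A sketch, using one common variant of the difference operation (replacing \(X_i\) by an independent copy; see [4] for the exact definition and normalization entering \(\Delta_m\)):

```python
import numpy as np

rng = np.random.default_rng(2)
N = 12
X  = rng.normal(size=N)      # the sample
Xp = rng.normal(size=N)      # independent copies used by the differences

# One common version of the difference operation (a sketch): D_i takes the
# difference between the statistic and its value with X_i replaced by an
# independent copy.  Iterated differences expand by inclusion-exclusion.
def D(stat, idx, x, xp):
    """Apply D_{i1} D_{i2} ... for idx = (i1, i2, ...)."""
    total = 0.0
    for mask in range(1 << len(idx)):
        y = x.copy()
        bits = 0
        for b, i in enumerate(idx):
            if mask >> b & 1:
                y[i] = xp[i]
                bits += 1
        total += (-1) ** bits * stat(y)
    return total

mean_stat = lambda x: x.mean()                         # linear statistic
u_stat    = lambda x: sum(x[i] * x[j] for i in range(len(x))
                          for j in range(i)) / len(x)  # degree-two U-statistic

# A second difference kills any linear statistic; a third difference
# kills any U-statistic of degree two.
assert abs(D(mean_stat, (0, 1), X, Xp)) < 1e-12
assert abs(D(u_stat, (0, 1, 2), X, Xp)) < 1e-12
```

By contrast, the second difference of the degree-two U-statistic is non-zero, which is why \(\Delta_2\) controls its quadratic part.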
Cramér type smoothness condition We introduce the function
and assume that, for some \(\delta >0\) and \(\nu _2>0\), we have
Here \(\beta _3=\sigma ^{-3}\mathbf{E}|g(X_1)|^3\). Define \(\nu =600^{-1}\min \{\nu _1,\nu _2,s-2,r-4\}\).
Theorem 1
Assume that for some absolute constants \(A_*,M_*,D_*>0\) and numbers \(r>4, s>2\), \(\nu _1,\nu _2>0\) and \(\delta ,\delta _{*}>0\), the conditions (5), (6), (7), (4) hold. Then there exists a constant \(C_*>0\) depending only on \(A_*\), \(M_*\), \(D_*\), r, s, \(\nu _1,\nu _2,\delta , \delta _{*}\) such that
Remark 1
The value of \(\nu =600^{-1}\min \{\nu _1,\nu _2,s-2,r-4\}\) is far from optimal. Furthermore, the moment conditions (5) and (6) are not the weakest possible that would ensure the approximation of order \(o(N^{-1})\). The condition (5) can likely be reduced to the moment conditions that are necessary to define the Edgeworth expansion terms \(\kappa _3\) and \(\kappa _4\); similarly, (6) can be reduced to \(\Delta _4^2/\sigma _\mathbb T^2= o(N^{-1})\). No effort was made to obtain the result under the optimal conditions. This would increase the complexity of the proof, which is already rather involved.
Remark 2
Condition (4) can be relaxed. Assume that for some absolute constant \(G_*\) we have
The bound of Theorem 1 holds if we replace (4) by this weaker condition. In this case we have \(\Delta \le C_*N^{-1-\nu }\), where the constant \(C_*\) depends on \(A_*,D_*, G_*\), \(M_*,r,s,\nu _1,\nu _2, \delta \).
In the particular case of U-statistics of degree three (the case where \(R\equiv 0\) in (1)) the proof of Theorem 1 was outlined in the unpublished manuscript by Götze and van Zwet [25]. We provide a complete and more readable version of the arguments sketched in that preprint and extend them to a general class of symmetric statistics. In the same paper [25], see also [4], it was shown that moment conditions (like (5), (6)) together with Cramér’s condition (like (7)) do not suffice for the bound \(\Delta =o(N^{-1})\). For convenience we state this result in Example 1 below.
Example 1
Let \(X_1,X_2,\dots \) be independent random variables uniformly distributed on the interval \((-1/2,1/2)\). Define \(T_N=(W_N+N^{-1/2}V_N)(1-N^{-1/2}V_N)\), where \(V_N=N^{-1/2}\sum \{N^{1/2}X_j\}\) and \(W_N=N^{-1}\sum [N^{1/2}X_j]\). Here [x] denotes the nearest integer to x and \(\{x\}=x-[x]\).
Assume that \(N=m^2\), where m is odd. We have, by the local limit theorem,
where \(c>0\) is an absolute constant. From these inequalities it follows by the independence of \(W_N\) and \(V_N\), that \(\mathbf{P}\{1-\delta ^2 N^{-1}\le T_N\le 1\}\ge c^2\delta N^{-1}\).
The example defines a sequence of U-statistics \(\mathbb T_N\) whose distribution functions \(\mathbb F_N\) have \(O(N^{-1})\) sized increments in a particular interval of length \(o(N^{-1})\). These fluctuations of magnitude \(O(N^{-1})\) appear as a result of a nearly lattice structure induced by the interplay between the (smooth) linear part and the quadratic part.
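The construction of Example 1 is easy to reproduce numerically. The sketch below verifies the algebraic identity \(W_N+N^{-1/2}V_N=N^{-1/2}\sum_j X_j\), which couples the lattice-valued part \(W_N\) with the smooth part \(V_N\) (ties in the nearest-integer rounding have probability zero and are resolved arbitrarily by `np.rint`):

```python
import numpy as np

rng = np.random.default_rng(3)
m = 31                      # odd, N = m**2 as in Example 1
N = m * m
X = rng.uniform(-0.5, 0.5, size=N)

sq = np.sqrt(N) * X
W = np.rint(sq)             # [x]: the nearest integer to x
V = sq - W                  # {x} = x - [x]
W_N = W.sum() / N           # W_N = N^{-1} sum [sqrt(N) X_j]
V_N = V.sum() / np.sqrt(N)  # V_N = N^{-1/2} sum {sqrt(N) X_j}
T_N = (W_N + V_N / np.sqrt(N)) * (1 - V_N / np.sqrt(N))

# Identity behind the example: the lattice and fractional parts recombine
# into the smooth linear statistic N^{-1/2} sum X_j.
S = X.sum()
assert abs(W_N + V_N / np.sqrt(N) - S / np.sqrt(N)) < 1e-10
assert abs(T_N - (S / np.sqrt(N)) * (1 - V_N / np.sqrt(N))) < 1e-10
```

The near-lattice behaviour of \(T_N\) near 1 comes from \(W_N\) taking values on the lattice \(N^{-1}\mathbb Z\) while \(V_N\) stays bounded in probability.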
1.3 Earlier work
There is a rich literature devoted to normal approximation and Edgeworth expansions for various classes of asymptotically linear statistics (see e.g. Babu and Bai [2], Bai and Rao [3], Bentkus, Götze and van Zwet [4], Bhattacharya and Ghosh [8, 9], Bhattacharya and Rao [7], Bickel [10], Bickel, Götze and van Zwet [11], Callaert, Janssen and Veraverbeke [16], Chibisov [17], Hall [28], Helmers [29], Petrov [33], Pfanzagl [34], Serfling [35], etc.).
A wide class of statistics can be represented as functions of sample means of vector variables. Edgeworth expansions of such statistics can be obtained by applying the multivariate expansion to corresponding functions, see Bhattacharya and Ghosh [8, 9]. In their work the crucial Cramér condition (C) is assumed on the joint distribution of all the components of a vector which may be too restrictive in cases where some components have a negligible influence on the statistic. More often only one or a few of the components satisfy a conditional version of condition (C). Bai and Rao [3], Babu and Bai [2] established Edgeworth expansions for functions of sample means under such a conditional Cramér condition. This approach exploits the smoothness of the distribution of a random vector as well as the smoothness of the function defining the statistic. In particular this approach needs a class of statistics which are smooth functions of observations or can be approximated by such functions via Taylor’s expansion, see also Chibisov [17]. The respective condition (6) of the present paper is expressed in terms of moments of iterated differences \(\Delta _m\) and does not assume Taylor’s expansion.
Let us note that in general the smoothness of the distribution function of \(\mathbb T\) may have little to do with the smoothness of the function \(\mathbb T(X_1,\dots , X_N)\) of the observations \(X_1,\dots , X_N\); take, for example, Gini’s mean difference \(\sum _{i<j}|X_i-X_j|\) with absolutely continuous \(X_i\). Another interesting example is Studentization, which can dramatically enhance the smoothness of the distribution function of a sum of lattice random variables, see [26]. Our Theorem 1 shows, in particular, that the structural condition (4) together with (7) guarantees the smoothness of the distribution of \(\mathbb T\) necessary for the bound \(\Delta =o(N^{-1})\).
In order to compare Theorem 1 with earlier results of similar nature let us consider the case of U-statistics of degree two
where \(h(\cdot ,\cdot )\) denotes a (fixed) symmetric kernel. Assume for simplicity of notation and without loss of generality that \(\mathbf{E}h(X_1,X_2)=0\). Write \(h_1(x)=\mathbf{E}(h(X_1,X_2)|X_1=x)\) and assume that \(\sigma _h^2>0\), where \(\sigma _h^2=\mathbf{E}h_1^2(X_1)\). In this case Hoeffding’s decomposition (1) reduces to \(\mathbb U=L+Q\), where, by the assumption \(\sigma _h^2>0\), we have \(\mathbf{Var}L>0\). Since the cubic part vanishes we remove the moment \(\mathbf{E}g(X_1)g(X_2)g(X_3)\chi (X_1,X_2,X_3)\) from the expression for \(\kappa _4\). In this way we obtain the two term Edgeworth expansion (2) for the distribution function \(\mathbb F_U(x)=\mathbf{P}\{\mathbb U\le \sigma _\mathbb Ux\}\) with \(\sigma ^2_\mathbb U:=\mathbf{Var}\mathbb U\).
We call h reducible if for some measurable functions \(u,v:\mathcal X\rightarrow \mathbb R\) we have \(h(x,y)=v(x)u(y)+v(y)u(x)\) for \(P_X\times P_X\) almost all \((x,y)\in \mathcal X\times \mathcal X\). A simple calculation shows that for a sequence of U-statistics (9) with a fixed non-reducible kernel, condition (4) is satisfied, for some \(\delta _{*}>0\), uniformly in N. A straightforward consequence of Theorem 1 is the following corollary. Write \({\tilde{\nu }}=600^{-1}\min \{\nu _2,r-4,1\}\).
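Non-reducibility can be probed numerically through the associated kernel operator \(f(\cdot)\rightarrow \mathbf{E}\,h(X,\cdot)f(X)\): a reducible kernel yields an operator of rank at most two, while, say, \(h(x,y)=\min(x,y)\) on (0, 1) has the eigenvalues \(((k-1/2)\pi)^{-2}\), \(k=1,2,\dots\). A discretized sketch (grid quadrature, illustrative kernels of our own choosing):

```python
import numpy as np

# Discretize the operator f -> E[h(X, .) f(X)] for X uniform on (0, 1)
# on a midpoint grid with quadrature weight 1/n.
n = 400
x = (np.arange(n) + 0.5) / n

H_red = np.outer(x, x)            # h(x, y) = x * y: reducible, rank one
H_min = np.minimum.outer(x, x)    # h(x, y) = min(x, y): non-reducible

def n_eigs(H, thresh=1e-3):
    """Number of eigenvalues of the discretized operator above thresh."""
    ev = np.linalg.eigvalsh(H / n)      # 1/n = quadrature weight
    return int(np.sum(np.abs(ev) > thresh))

assert n_eigs(H_red) <= 2         # reducible: at most two nonzero eigenvalues
assert n_eigs(H_min) >= 5         # non-reducible: many nonzero eigenvalues
```

This is exactly the dichotomy behind the comparison with the eigenvalue condition of [11] discussed below Corollary 1.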
Corollary 1
Assume that \(\mathbf{E}h(X_1,X_2)=0\) and for some \(r>4\)
Assume that \(\sigma _h^2>0\) and the kernel h is non-reducible and that for some \(\delta >0\)
Then there exists a constant \(C_*>0\) such that
For U-statistics with a fixed kernel the validity of the Edgeworth expansion (2) up to the order \(o(N^{-1})\) was established by Callaert, Janssen and Veraverbeke [16] and Bickel, Götze and van Zwet [11]. In addition to the moment conditions (like (10)) and Cramér’s condition (like (11)) Callaert, Janssen and Veraverbeke [16] imposed the following rather implicit condition. They assumed that for some \(0<c<1\) and \(0<\alpha <1/8\) the event
has probability \(1-o(1/N\log N)\) uniformly for all \(t\in [N^{3/4}/\log N,\, N\log N]\). Here \(m\approx N^\alpha \), for a small positive \(\alpha \). Bickel, Götze and van Zwet [11] more explicitly required that the linear operator \(f(\cdot )\rightarrow \mathbf{E}\psi (X,\cdot )f(X)\) defined by \(\psi \) has a sufficiently large number of non-zero eigenvalues (the number depending on the existing moments, but always larger than 4). Accordingly, the eigenvalue condition is stronger than the non-reducibility condition of Corollary 1, since for a reducible kernel h the linear operator \(f(\cdot )\rightarrow \mathbf{E}\psi (X,\cdot )f(X)\) has at most two eigenvalues. On the other hand, it is difficult to compare the structural non-reducibility condition with condition (12), whose technical nature is discussed in the outline of the proof at the beginning of Sect. 2.
The remaining parts of the paper (Sects. 2–5) contain the proof of Theorem 1. Auxiliary results are placed in the “Appendix”.
2 Proof of Theorem 1
2.1 Proof highlights
After the seminal paper of Esseen [22], a standard proof of the validity of the normal approximation and its refinements proceeds in two steps. In the first step, with the aid of a smoothing inequality, the Kolmogorov distance between the distribution function and its approximation G is bounded above by a (weighted) average difference of the respective Fourier transforms, see (25). In the second step one performs a careful analysis of the Fourier transforms: for frequencies \(t=O(\sqrt{N})\) one shows the closeness of the respective Fourier transforms, while for the remaining range \(\Omega (\sqrt{N})\le |t|\le O(T)\) one establishes their exponential decay. The cut-off T is defined by the desired approximation accuracy level \(O(T^{-1})\) (in our case \(T=N^{1+\nu }\)). The approach, initially developed for sums of independent random variables [22, 33], was later applied to non-degenerate U-statistics [11, 16] and general asymptotically linear symmetric statistics [4, 37].
One particular problem related to the implementation of the proof strategy outlined above is establishing the exponential decay of the (absolute value of the) Fourier transform in the range of large frequencies. For a linear statistic this problem is elegantly resolved by introducing Cramér’s condition. Indeed, in view of the multiplicative property of the Fourier transform, Cramér’s condition implies the desired exponential decay. Consequently, Cramér’s condition together with moment conditions ensures the validity of an Edgeworth expansion of arbitrary order. But the multiplicative property cannot be used any more (at least directly) when we turn to general symmetric statistics, because the various parts (linear, quadratic, etc.) are mutually dependent. This fact leads to considerable difficulties in estimating the respective Fourier transforms in the range of large frequencies \(t\gg N\) and requires new conditions to control the above mentioned dependencies. The present paper suggests a novel approach to the estimation of the Fourier transform of a symmetric statistic for large frequencies.
As our general setup of symmetric statistics covers linear ones, we keep assuming the Cramér condition, but on the linear part of the statistic only, see (7). In view of Example 1, condition (7) is not enough. We introduce the additional structural condition (4), which together with (7) guarantees the desired \(O(N^{-1-\nu })\) upper bound on the weighted average of the Fourier transform over the frequency range \(N^{1-\nu }\le |t|\le N^{1+\nu }\), see (26) below. Condition (4) is optimal and natural in the sense that it matches the counterexample. It has first appeared in the unpublished manuscript [25] by Götze and van Zwet in the case of U statistics.
Let us compare (4) with alternative conditions introduced in the earlier papers by Callaert, Janssen and Veraverbeke [16] and Bickel, Götze and van Zwet [11] in the case of U-statistics of degree two. The conditional Cramér condition (12) of [16] forces the multiplicative property of the Fourier transform in a formal way, thus circumventing the problem of establishing a relation between the structure of the kernel (of the U-statistic) and the smoothness of the distribution. Therefore (4) and (12) are not comparable. This is not the case with the eigenvalue condition of [11], which is stronger than (4). In their proof Bickel, Götze and van Zwet [11] used, for the frequencies \(t\in [N^{(r-1)/r}/\log N,\, N\log N]\), a symmetrization technique of [23] which essentially estimates the absolute value of the Fourier transform of U by that of a bilinear version of Q, thus neglecting L and its smoothness properties implied by Cramér’s condition (7). The approach of the present paper instead makes use of the smoothness of L and Q simultaneously.
The main contribution of this paper is showing that condition (4), suggested by the counterexample (Example 1), is sufficient to prove the bound of Theorem 1. This condition is used in constructing estimates of weighted averages of the Fourier transform (26), on which we briefly comment below. In fact, after an initial “linearization” step we turn to a slightly modified statistic \({\tilde{\mathbb T}}(X_1,\dots , X_N)\), where the non-linear terms in \(X_1,\dots , X_m\) are removed (see (19)), and then switch to \(T' ={\tilde{\mathbb T}}(X_1,\dots , X_m, Y_{m+1},\dots , Y_N)\), where \(Y_{m+1},\dots , Y_{N}\) are truncated versions of \(X_{m+1},\dots , X_{N}\), see (42). Let \(\mathbf{E}_{\mathbb Y}\) denote the conditional expectation given \(\mathbb Y=(Y_{m+1},\dots , Y_N)\). The conditional Fourier transform \(\mathbf{E}_{\mathbb Y}\exp \{itT'\}=\mathbf{E}\bigl (\exp \{itT'\}\bigl |Y_{m+1},\dots , Y_N\bigr )\) contains a multiplicative component \(\alpha _t^m\), where
For t satisfying \(|\alpha _t|^2\le 1-m^{-1}\ln ^2N\) the bound \(|\mathbf{E}_{\mathbb Y}\exp \{itT'\}|\le \exp \{-0.5\ln ^2N\}\) follows immediately. We then look carefully at the set of remaining t. We show that this set is a union of non-intersecting intervals (depending on \(\mathbb Y\)), each of size \(O(\sqrt{N/m}\ln N)\). While estimating the weighted averages of the Fourier transform over these intervals, we split the frequency domain \(N^{1-\nu }\le |t|\le N^{1+\nu }\) into a deterministic sequence \(J_p\), \(p=1,2,\dots \), of consecutive intervals of size \(\Theta (N^{1-\nu })\), so that each ‘singular’ set \(\{t\in J_p:\, |\alpha _t|^2>1-m^{-1}\ln ^2N\}\) is either empty or an interval \([a_N, a_N+b_N^{-1}]\) of size \(b_N^{-1}=O\bigl (\sqrt{N/m}\ln N\bigr )\) (see (51) and (56), based on Lemma 12). At the very last step, using Kleitman’s concentration inequalities for sums of random variables with values in a function space, we bound the probability of the event that a particular singular set is non-empty, that is, the event that \(\sup _{t\in J_p}|\alpha _t|^2> 1-m^{-1}\ln ^2N\), thus obtaining an extra factor \(N^{-k\nu }\), \(k\ge 5\), to arrive at the error bound \(o(N^{-1})\).
More precisely, the projection of \(\sum _{l=m+1}^N\psi (\cdot ,Y_l)\) onto the orthogonal complement of g, which is non-zero by condition (4), is used in the crucial Lemma 2. Via conditioning and randomization we represent it as a sum \(S_{\alpha }:=\sum _{j=1}^n \alpha _j f_j\) with independent \(\alpha _j\in \{0,1\}\) and vectors \(f_j\) satisfying \(||f_j||> \epsilon \), and estimate the combinatorial probability of those \(\alpha =(\alpha _1,\dots , \alpha _n)\) for which a value larger than \(1- m^{-1} \ln ^2 N\) of the conditional Fourier transform, say \(\tilde{\phi }_t(\alpha )\), of \(f+ S_{\alpha }\) occurs at some ‘singular’ frequency \( t \in J_p\). This is achieved by Kleitman’s partition of the \(2^n\) \(\alpha \)’s into at most \(\left( {\begin{array}{c}n\\ n/2\end{array}}\right) \) disjoint sets, say \(C_d\), \(1\le d\le \left( {\begin{array}{c}n\\ n/2\end{array}}\right) \), such that for different \(\alpha , \alpha '\in C_d\) the sums \(S_{\alpha }\) and \(S_{\alpha '}\) are separated by a distance of at least \(\epsilon \). By Lemma 2 this separation implies that the event that some t is singular in the interval \(J_p\) can be witnessed by at most one \(\alpha \in C_d\) for each \(C_d\). Hence the singular event among the \(\alpha \)’s has combinatorial probability at most \(\left( {\begin{array}{c}n\\ n/2\end{array}}\right) 2^{-n} =O(n^{-1/2})\).
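In its simplest one-dimensional form, the combinatorial bound \(\binom{n}{n/2}2^{-n}=O(n^{-1/2})\) is the Erdős version of the Littlewood–Offord inequality, which can be verified by exhaustive enumeration for small n. A sketch with illustrative coefficients of our own choosing:

```python
import numpy as np
from itertools import product
from math import comb

rng = np.random.default_rng(4)
n = 14

# Littlewood-Offord bound in dimension one: if |a_i| >= 1, any open interval
# of length 2 contains at most binom(n, n//2) of the 2^n signed sums
# sum_i eps_i a_i, i.e. a fraction O(n^{-1/2}) of all sign patterns.
a = rng.uniform(1.0, 2.0, size=n) * rng.choice([-1.0, 1.0], size=n)
sums = np.sort([float(np.dot(eps, a))
                for eps in product((-1.0, 1.0), repeat=n)])

# Worst count over sliding windows [s, s + 2).
worst = max(int(np.searchsorted(sums, s + 2.0, side='left')) - i
            for i, s in enumerate(sums))
bound = comb(n, n // 2)
assert worst <= bound
assert bound / 2**n < 1.0 / np.sqrt(n)   # the O(n^{-1/2}) fraction
```

Kleitman’s theorem extends this separation-based counting from the real line to normed spaces, which is the form needed for the function-space valued sums \(S_\alpha\) above.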
The crucial arguments in Lemma 2 rest upon the observation on harmonics (see (118)) that two singular values \(\tilde{\phi }_t(\alpha ), \tilde{\phi }_s(\alpha ' ) \ge 1-m^{-1}\ln ^2N \) imply a similarly high value of \(\mathbf{E}\exp \{i( t (f+ S_{\alpha }) - s(f+ S_{\alpha '}))\}\). If t and s are close, say \(|t-s| \le \delta _2\), such a high value is excluded by the separation of \(S_{\alpha }\) and \(S_{\alpha '}\), which dominate \((t-s)f\) (see step 4.2.1 in Lemma 2), whereas for \( \delta _2<|t-s| < N^{\nu -1/2}\), Cramér’s condition for \((t-s)f\) applies, which together with size bounds on \(t S_{\alpha }\) and \(sS_{\alpha '}\) again prevents a high value (see step 4.2.2 in Lemma 2).
Note that this method of width bounds and separation of singular sets of Fourier transforms has been successfully employed for optimal approximation results for U-statistics with non-Gaussian limits by Bentkus, Götze and Zaitsev, see [5] and [27] and is strongly related to results on the distribution of quadratic forms on lattices by Bentkus and Götze, see [6] and [24], the latter providing a solution of the Davenport-Lewis conjecture for positive definite forms.
Finally, we mention that in the case of U-statistics of degree three (\(\mathbb T=\mathbf{E}\mathbb T+L+Q+K\)) the proof is outlined in the unpublished manuscript of Götze and van Zwet [25]. We extend these arguments to general symmetric statistics using stochastic expansions by means of Hoeffding’s decomposition and bounds for the various parts of the decomposition.
2.2 Outline of the proof
First, using the linear structure induced by Hoeffding’s decomposition, we replace \(\mathbb T/\sigma _\mathbb T\) by a statistic \({\tilde{\mathbb T}}\) which is conditionally linear given \(X_{m+1},\dots , X_N\). Second, invoking a smoothing inequality, we pass from distribution functions to Fourier transforms. In the remaining steps we bound the difference \(\delta (t)=\mathbf{E}e^{it {\tilde{\mathbb T}}}- {\hat{G}}(t)\) for \(|t|\le N^{1+\nu }\). For "small" frequencies \(|t|\le C N^{1/2}\) we expand the characteristic function \(\mathbf{E}e^{it {\tilde{\mathbb T}}}\) in order to show that \(\delta (t)=o(N^{-1})\). Here we combine various techniques developed in earlier papers [4, 11, 16]. For the remaining range of frequencies, \(C N^{1/2}\le |t|\le N^{1+\nu }\), we bound \(\mathbf{E}e^{it {\tilde{\mathbb T}}}\) and \({\hat{G}}(t)\) separately. The cases of "large" frequencies \(N^{1-\nu }\le |t|\le N^{1+\nu }\) and "medium" frequencies \(C\sqrt{N}\le |t|\le N^{1-\nu }\) are treated in different manners. For medium frequencies the Cramér type condition (7) ensures the exponential decay of \(|\mathbf{E}e^{it {\tilde{\mathbb T}}}|\). For large frequencies we combine conditions (7) and (4).
2.3 Hoeffding’s decomposition
Before starting the proof we introduce some notation. By \(c_*\) we shall denote a positive constant which may depend only on \(A_*,D_*,M_*, r, s, \nu _1,\nu _2, \delta \), but it does not depend on N. In different places the values of \(c_*\) may be different.
It is convenient to write the decomposition in the form
where, for every k, the symmetric kernel \(g_k\) is centered, i.e., \(\mathbf{E}g_k(X_1,\dots , X_k)=0\), and satisfies, see, e.g., [4],
Here we write \(g_1:= N^{-1/2}g\), \(g_2:= N^{-3/2}\psi \) and \(g_3:= N^{-5/2}\chi \). Furthermore, for an integer \(k>0\) we write \(\Omega _k:=\{1,\dots , k\}\). Given a subset \(A=\{i_1,\dots , i_k\}\subset \Omega _N\) we write, for short, \(T_A:=g_k(X_{i_1},\dots , X_{i_k})\). Put \(T_{\emptyset }:=\mathbf{E}\mathbb T\). Now the decomposition (14) can be written as follows
2.4 Proof of Theorem 1
Throughout the proof we assume without loss of generality that
Denote, for \(t>0\),
Linearization. Choose a number \(\nu >0\) and an integer m such that
Split
Furthermore, write
Before applying a smoothing inequality we replace \(\mathbb F(x)\) by
In order to show that \(\Lambda \) can be neglected we apply a simple Slutsky type argument. Given \(\varepsilon >0\), we have
From Lemma 5 we obtain via Chebyshev’s inequality, for \(\varepsilon =N^{-1-\nu }\),
In the last step we used conditions (5), (6) and the inequality (168). Furthermore, using (5) and (6) one can show that
Therefore, (20) implies
It remains to show that \({\tilde{\Delta }}\le c_*N^{-1-\nu }\).
A smoothing inequality. Given \(a>0\) and an even integer \(k\ge 2\), consider the probability density function, see (10.7) in Bhattacharya and Rao [7],
where c(k) is the normalizing constant. Its characteristic function
vanishes outside the interval \(|t|\le ka\). Here \(u^{*k}_{[-a,a]}(t)\) denotes the probability density function of the sum of k independent random variables each uniformly distributed in \([-a,a]\). It is easy to show that the function \(t\rightarrow {\hat{g}}_{a,k}(t)\) is unimodal and symmetric around \(t=0\).
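For \(k=2\) both the density and its compactly supported characteristic function are fully explicit: \(g_{a,2}(x)=(a/\pi)(\sin(ax)/(ax))^2\) has characteristic function \((1-|t|/(2a))_+\), the triangle vanishing for \(|t|>2a=ka\). This makes for a quick numerical sanity check (direct Riemann-sum integration; grid parameters are ours):

```python
import numpy as np

a = 1.0
x = np.linspace(-2000.0, 2000.0, 800_001)
dx = x[1] - x[0]
# g_{a,2}(x) = (a/pi) * (sin(ax)/(ax))**2; np.sinc(z) = sin(pi z)/(pi z).
dens = (a / np.pi) * np.sinc(a * x / np.pi)**2

def cf(t):
    """Characteristic function by direct numerical integration (even density)."""
    return float((dens * np.cos(t * x)).sum() * dx)

assert abs(cf(0.0) - 1.0) < 1e-3    # a probability density
assert abs(cf(1.0) - 0.5) < 1e-2    # triangle value 1 - |t|/(2a) at t = a
assert abs(cf(3.0)) < 1e-2          # vanishes outside |t| <= 2a
```

The convolution construction with general even k trades slower tail decay of the density for a characteristic function supported on the wider interval \(|t|\le ka\).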
Let \(\mu \) be the probability distribution with the density \(g_{a,2}\), where a is chosen to satisfy \(\mu ([-1,1])=3/4\). Given \(T>1\) define \(\mu _T({\mathcal{A}})=\mu (T{\mathcal{A}})\), for a Borel set \({\mathcal{A}}\subset \mathbb R\). Let \({\hat{\mu }}_T\) denote the characteristic function corresponding to \(\mu _T\).
We apply Lemma 12.1 of [7]. It follows from (21) and the identity \(\mu _T([-T^{-1}, T^{-1}])= 3/4\) that
Here \({\tilde{\mathcal{F}}}\) and \(\mathcal{G}\) denote the probability distribution of \({\tilde{\mathbb T}}\) and the signed measure with density function \(G'(x)\) respectively. Furthermore, \(*\) denotes the convolution operation. Proceeding as in the proof of Lemma 12.2 ibidem we obtain
where \({\hat{G}}\) denotes the Fourier transform of G(x). Note that \({\hat{\mu }}_T(t)\) vanishes outside the interval \(|t|\le 2aT\). Finally, we obtain from (23) and (24) that
where \(T'=T/2a\). Here we use the fact that \({\hat{\mu }}_{T'}(t)=0\) for \(|t|>T\). Denote \(K_N(t)= {\hat{\mu }}_{T'}(t)\) and observe that \(|K_N(t)|\le 1\) (since \(\mu _{T'}\) is a probability measure). Let
We have
Here we denote \(t_1=N^{1/2}10^{-3}/\beta _3\) and \(t_2=N^{1-\nu }\). In view of (25) the bound \({\tilde{\Delta }}\le c_*N^{-1-\nu }\) follows from the bounds
The bound \(I_2\le c_*N^{-1-\nu }\) is a consequence of the exponential decay of \(|{\hat{G}}(t)|\) as \(|t|\rightarrow \infty \). In Sect. 3 we show (26) for \(k=3,4\). The proof of (26) for \(k=1\) is based on careful expansions and is given in Sect. 5.
3 Large frequencies
Here we prove inequalities (26) for \(I_3\) and \(I_4\). The proof of \(|I_3|\le c_*N^{-1-\nu }\) is relatively simple and is deferred to the end of the section.
Let us upper bound \(|I_4|\). We will show that
In what follows we assume that N is sufficiently large, say \(N>C_*\), where \(C_*\) depends only on \(A_*,D_*,M_*,r,s,\nu _1,\nu _2,\delta \). We use this assumption in several places below, where the constant \(C_*\) can be easily specified. Note that for small N with \(N\le C_*\) the inequality (27) is trivial.
3.1 Notation
Let us first introduce some notation. Define the number
and note that for \(r\in (4,5]\) and \(\nu \) defined by (17) we have
Given N introduce the integers
We have \(N-m=M\,n+s\), where s is an integer with \(0\le s<n\). Observe that the inequalities \(\nu <600^{-1}\) and \(m<N^{1/2}\), see (17), imply \(M>n\). Therefore \(s<M\). Split the index set
Clearly, \(O_1,\dots , O_{n-1}\) are of equal size (=M) and \(|O_n|=M+s<2M\).
We shall assume that the random variable \(X:\Omega \rightarrow \mathcal X\) is defined on the probability space \((\Omega , P)\) and \(P_X\) is the probability distribution on \(\mathcal X\) induced by X. Given \(p\ge 1\) let \(L^p=L^p(\mathcal X,P_X)\) denote the space of real functions \(f:\mathcal X\rightarrow \mathbb R\) with \(\mathbf{E}|f(X)|^p<\infty \). Denote \(\Vert f\Vert _p=(\mathbf{E}|f(X)|^p)^{1/p}\). With a random variable f(X) we associate an element (vector) \(f=f(\cdot )\) of \(L^p\), \(p\le r\). Let \(p_g:L^2\rightarrow L^2\) denote the projection onto the subspace orthogonal to the vector \(g(\cdot )\) in \(L^2\). Given \(h\in L^2\), decompose
Here \(\left<h,g\right>=\int h(x)g(x)P_X(dx)\). For \(h\in L^r\) we have
Furthermore, for \(r^{-1}+v^{-1}=1\) (here \(r\ge 2\ge v>1\)) we have, by Hölder’s inequality,
In particular,
Denote
and observe that \(c_g\le c_g^*\). It follows from the decomposition (31) and (33) that
Introduce the numbers
It follows from (7) that there exist \(\delta ',\delta ''>0\) depending on \(A_*, M_*, \delta \) such that (uniformly in N) Cramér’s characteristic \(\rho \) satisfies the inequalities
We shall prove the first inequality only. In view of (7) it suffices to prove that \(\rho (a_1,\beta _3^{-1})\ge \delta '\). Expanding the exponent in powers of \(itg(X_1)/\sigma \) we show the inequality
For \(|t|\le \beta _3^{-1}\) this inequality implies
Therefore, \(\rho (a_1,\beta _3^{-1})\ge a_1^2/3\) and we can choose \(\delta '=\min \{\delta , a_1^2/3\}\) in (36).
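The expansion used here is the standard third-order Taylor bound for characteristic functions. A sketch, under the assumption that \(\beta _3\) denotes the ratio \(\mathbf{E}|Y|^3/\mathbf{E}Y^2\) for \(Y=g(X_1)/\sigma \): since \(\mathbf{E}Y=0\),

\[ \bigl |\mathbf{E}e^{itY}\bigr |\le 1-\frac{t^2}{2}\,\mathbf{E}Y^2+\frac{|t|^3}{6}\,\mathbf{E}|Y|^3 \qquad \text{for}\qquad t^2\,\mathbf{E}Y^2\le 2, \]

and for \(|t|\le \beta _3^{-1}\) the right-hand side is at most \(1-\frac{t^2}{3}\mathbf{E}Y^2\), because then \(\frac{|t|^3}{6}\mathbf{E}|Y|^3\le \frac{t^2}{6}\mathbf{E}Y^2\).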
Introduce the constant (depending only on \(A_*,M_*,\delta \))
Note that \(0<\delta _1<1/10\). Given \(f\in L^r\) and \(T_0\in {\mathbb R}\) such that
denote
Given a random variable \(\eta \) with values in \(L^r\) and number \(0<s< 1\), define
Introduce the function
and the number
It follows from (4) and our assumption \(\sigma _\mathbb T^2=1\), see (16), that \(\delta _3^2\ge \delta _{*}^2\).
3.2 Proof of (27)
We write \(\mathbf{E}_{\mathbb Y}\exp \{itT'\}\) in the form \(\mathbf{E}_{\mathbb Y}\exp \{itT'\}=\alpha _t^m\exp \{itW'\}\), where \(\alpha _t\) is defined in (13) and where the random variable \(W'\) is defined in the same way as \({\mathbb W}\) in (18), but with \(T_A=g_k(X_{i_1},\dots , X_{i_k})\) replaced by \(g_k(Y_{i_1},\dots , Y_{i_k})\) for each \(A=\{i_1,\dots , i_k\}\). A standard way to upper bound a quantity like \(|\mathbf{E}_{\mathbb Y}e^{itT'}|\) is to show an exponential decay (in m) of the product \(|\alpha _t^m|\) using a Cramér type condition. This task can be accomplished for medium frequencies. Indeed, for \(|t|=o( N)\) the quadratic part \(itN^{-3/2}\sum _{j=m+1}^N\psi (X_1,Y_j)\) can be neglected and Cramér’s condition implies \(|\alpha _t|\le 1-v'\) for some \(v'>0\). This leads to an exponential bound \(|\alpha _t^m|\le e^{-mv'}\). For large frequencies \(|t|\approx N\), the contribution of the quadratic part becomes significant. To upper bound \(|\alpha _t^m|\) we use condition (4). We show that, for a large set of values \(t\in J_p\), see (51), Cramér’s condition (7) yields the desired decay of \(|\alpha _t^m|\), while the measure of the set of remaining t is small with high probability.
Step 1. Truncation. Recall that the random variable \(X:\Omega \rightarrow \mathcal X\) is defined on the probability space \((\Omega , P)\). Let \(X'\) be an independent copy so that \((X,X')\) is defined on \((\Omega \times \Omega ', P\times P)\), where \(\Omega '=\Omega \). It follows from \(\mathbf{E}|\psi (X,X')|^r<\infty \), by Fubini, that for P almost all \(\omega '\in \Omega '\) the function \(\psi (\cdot , X'(\omega '))=\{x\rightarrow \psi (x,X'(\omega ')), \, x\in \mathcal X\}\) is an element of \(L^r\). Furthermore, one can define an \(L^r\)-valued random variable \(Z':\Omega '\rightarrow L^r\) such that \(Z'(\omega ')=\psi (\cdot , X'(\omega '))\), for P almost all \(\omega '\). Consider the event \({\tilde{\Omega }}=\{\Vert Z'\Vert _r\le N^{\alpha }\}\subset \Omega '\) and denote \(q_N=P({\tilde{\Omega }})\). Here \(\Vert Z'\Vert _r=(\int |\psi (x,X'(\omega '))|^rP_X(dx))^{1/r}\) denotes the \(L^r\) norm of the random vector \(Z'\) and \(\alpha \) is defined in (28). Let \(Y:{\tilde{\Omega }}\rightarrow \mathcal X\) denote the random variable \(X'\) conditioned on the event \({\tilde{\Omega }}\). Therefore Y is defined on the probability space \(({\tilde{\Omega }}, {\tilde{P}})\), where \({\tilde{P}}\) denotes the restriction of \(q_N^{-1}P\) to the set \({\tilde{\Omega }}\) and, for every \(\omega '\in {\tilde{\Omega }}\), we have \(Y(\omega ')=X'(\omega ')\). Let Z denote the \(L^r\)-valued random element \(\{x\rightarrow \psi (x,\, Y(\omega '))\}\) defined on the probability space \(({\tilde{\Omega }}, {\tilde{P}})\).
We can assume that \({\mathbb X}:=(X_1,\dots , X_N)\) is a sequence of independent copies of X defined on the probability space \((\Omega ^N, P^N)\). Let \({\overline{\omega }}=(\omega _1,\dots , \omega _N)\) denote an element of \(\Omega ^N\). Every \(X_j\) defines a random vector \(Z_j'=\psi (\cdot , X_j)\) taking values in \(L^r\). Introduce the events \(A_j:=\{\Vert Z_j'\Vert _r\le N^{\alpha }\}\subset \Omega ^N\) and let \({\mathbb X}'=(X_1,\dots , X_m,Y_{m+1},\dots , Y_N)\) denote the sequence \({\mathbb X}\) conditioned on the event \(\Omega ^*=\cap _{j=m+1}^NA_j=\Omega ^{m}\times {\tilde{\Omega }}^{N-m}\). Clearly, \({\mathbb X}'({\overline{\omega }})={\mathbb X}({\overline{\omega }})\) for every \({\overline{\omega }}\in \Omega ^*\) and \({\mathbb X}'\) is defined on the space \(\Omega ^{m}\times {\tilde{\Omega }}^{N-m}\) equipped with the probability measure \(P^m\times {\tilde{P}}^{N-m}\). In particular, the random variables \(X_1,\dots , X_m, Y_{m+1},\dots , Y_{N}\) are independent and \(Y_j\), for \(m+1\le j\le N\), has the same distribution as Y. Let \(Z_j\) denote the \(L^r\)-valued random element \(\{x\rightarrow \psi (x,Y_j),\, x\in \mathcal X\}\), for \(m+1\le j\le N\). Let
We are going to replace \(\mathbf{E}e^{it{\tilde{\mathbb T}}}\) by \(\mathbf{E}e^{itT'}\). For \(s>0\) we have almost surely
From (43) with \(s=r\) we obtain, by Chebyshev’s inequality, that
Consequently, for \(k\le N\) we have
Using the identity, which holds for a measurable function \(f:\mathcal X^N\rightarrow {\mathbb R}\),
we obtain from (45) and (46) for \(f\ge 0\) that
Furthermore, (45) and (46) imply
Now we can replace the integral in (27) by the integral
In view of (48) and the simple inequality \(|K_N(t)|\le 1\) the error of this replacement is at most \(c_*N^{-1-2\nu }\). Hence in order to prove (27) it remains to show that
Step 2. Here we prove (50). We split the integral
where \(\{J_p,\, p=1,2,\dots \}\) is a sequence of consecutive intervals of length \(\approx \delta _1N^{1-\nu }\) each and \(\cup _pJ_p=[N^{1-\nu }, N^{1+\nu }]\). Recall that \(\delta _1\) is defined in (37). To prove (50) we show that for every p
We fix p and prove (52). Firstly, we replace \(I_p\) by \(\mathbf{E}J_*\), where
and where \(I_*=I_*(Y_{m+1}, \dots , Y_{N})\subset J_p\) is a random subset:
Note that for \(t\in J_p\setminus I_*\), we have
These inequalities imply the bound
Secondly, we show that with a high probability the set \(I_*\subset J_p\) is an interval. This fact, together with the monotonicity of \(v_N(t)\), will be used later to bound the integral \(J_*\). Introduce the \(L^r\)-valued random element
We apply Lemma 12 (see below) to the set \(N^{-1/2}I_*\) conditionally given the event \({{\mathbb S}}=\{\Vert S\Vert _r<N^{\nu /10}\}\). This lemma shows that \(N^{-1/2} I_*\) is an interval of size at most \(c_*\varepsilon _m\). Hence we can write \(I_*=(a_N,a_N+b_N^{-1})\) and
where the random variables \(a_N,b_N\) (functions of \(Y_{m+1},\dots , Y_N\)) satisfy
Furthermore, by Lemma 13 below we have \(\mathbf{P}\{{\mathbb S}\}\ge 1-c_*N^{-3}\). Therefore,
Next, we observe that \(I_*\not =\emptyset \) if and only if \( {\tilde{\alpha }}^2>1-\varepsilon _m^2\), where
Therefore we can write (56) in the form
This identity together with (54) and (57) implies
Using the integration by parts formula we shall show below that
Moreover, we shall show that
The latter inequalities in combination with (58) and (59) yield (52). We prove (60) in Sect. 3.3.
Let us prove (59). Firstly, we show that
From the integration by parts formula we obtain the identity
By our choice of the smoothing kernel the function \(v_N(t)\) is monotone on \(J_p\). Therefore
Invoking the simple inequality \(|a'|\le |v_N(a_N)|+|v_N(a_N+b_N^{-1})|\) and using \(|v_N(t)|\le |t|^{-1}\) we obtain from (62) that
For \(|{\hat{T}}|>b_N\), this inequality implies (61). For \(|{\hat{T}}|\le b_N\) the inequality (61) follows from the inequalities
The proof of (61) is complete. Now from (61) and the inequality \(a_N \ge N^{1-\nu }\) we obtain that
Finally, using the inequality (which holds for arbitrary real number v)
we show that
The latter inequality implies (59).
3.3 Proof of (60)
The first and second inequalities of (60) are proved in Steps A and B, respectively.
Step A. Proof of the first inequality of (60). Recall \({\mathbb W}\) from (18). We split
Define \(W_1', W_2', W_3'\) as \(W_1, W_2, W_3\) above, but with \(X_j\) replaced by \(Y_j\), for \(m+1\le j\le N\). We have \(W'=W_1'+W_2'+W_3'\). Now we write \({\hat{T}}\) (see (49)) in the form \({\hat{T}}=L+\Delta +W_3'\), where
The inequalities \(|{\hat{T}}|\le \varepsilon \) and \(|L|\ge 2\varepsilon \) imply \(|\Delta +W_3'|>\varepsilon \). Therefore,
To prove the first inequality of (60) we show that
Step A.1. Proof of the second inequality of (64). We have
It follows from (47), by Chebyshev’s inequality, that \( \mathbf{P}\{|W_3'|>\varepsilon /2\}\le c_*\varepsilon ^{-2}\mathbf{E}W_3^2\). Furthermore, invoking the inequalities, see (167), (168) below,
we obtain from (65) that \(I_2(\varepsilon )\le I_3(\varepsilon )+c_*\varepsilon ^{-2}N^{-2}\). Since
it suffices to show inequality (64) for \(I_3(\varepsilon )\) (instead of \(I_2(\varepsilon )\)). Recall the notation \(\Lambda _1=N^{-3/2}\sum _{1\le i<j\le m}\psi (X_i,X_j)\) and put \(U=\Lambda _1+\Delta \). We have
Invoking the inequality, which follows by Chebyshev’s inequality,
we upper bound the integral
Hence, it remains to show the second inequality of (64) for \(I_4(\varepsilon )\).
Let \(I_4'(\varepsilon )\) be the same probability as \(I_4(\varepsilon )\) but with \(X_i\) replaced by \(Y_i\), for \(1\le i\le m\). That is,
By the same reasoning as in (48) we obtain that \(|I_4(\varepsilon )-I_4'(\varepsilon )|\le c_*N^{-1-3\nu }\). Now, in view of the bound
we conclude that it suffices to show (the second) inequality (64) for \(I_4'(\varepsilon )\).
Let us show the second inequality of (64) for \(I_4'(\varepsilon )\). We split the sample
into three groups of nearly equal size. Next, we split \(U'=\sum _{i\le j}U'_{ij}\) so that the sum \(U'_{ij}\) depends on the observations from the groups \({\mathbb Y}_i\) and \({\mathbb Y}_j\) only. We have
Now we show that the second inequality of (64) holds for every summand in the right of (66). Let \({\tilde{U}}\) denote a summand \(U'_{ij}\), say, not depending on \({\mathbb Y}_3\). Let
We observe that
and note that the random function \( x \rightarrow {\overline{S}}(x)\) is a sum of i.i.d. random variables with values in \(L^r\) such that, for every i, we have \(\Vert \psi (\cdot ,Y_i)\Vert _r\le N^{\alpha }\) for almost all values of \(Y_i\). By Lemma 13,
Therefore in (67) we can replace the event \(\mathcal V\) by \(\mathcal{V}_1=\mathcal{V}\cap \{\Vert {\overline{S}}\Vert _r\le N^{\nu }\}\). Furthermore, since \({\tilde{U}}\) does not depend on \({\mathbb Y}_3\), we have \(\mathbf{E}{\mathbb I}_\mathcal{U}{\mathbb I}_{\mathcal{V}_1} =\mathbf{E}{\mathbb I}_\mathcal{U}p'\), where \(p':=\mathbf{E}\bigl ({\mathbb I}_{\mathcal{V}_1}|{\mathbb Y}_1, {\mathbb Y}_2\bigr )\). The concentration bound for the conditional probability \(p'\), which is shown below,
implies
In the last step we applied Markov’s inequality
and the bound \(\mathbf{E}|N^{1/2}{\tilde{U}}|^r\le c_*\mathbf{E}|N^{1/2}U_{ij}|^r\le c_*\). Here \(U_{ij}\) denotes the random variable obtained from \({\tilde{U}}\) after we replace \(Y_j\) by \(X_j\) for every j. The second-to-last inequality follows from (47). The last inequality follows from the well-known moment inequalities for U-statistics [20].
It follows from (69) and the simple inequality \(\varepsilon \ge b_N\ge c_* N^{-1/2}\) that
provided that \( m^{r/2}\ge N^{6\nu }\). The latter inequality is ensured by (17). Thus we have shown (64) for \({\tilde{I}}(\varepsilon )\).
It remains to prove (68). We write \(L'+U'\) in the form \(L_*+U_*+b-x\), where
and where b is a function of \(\{Y_i\in {\mathbb Y}\setminus {\mathbb Y}_3\}\). Introduce the random variables \({\overline{L}}\) and \({\overline{U}}\) which are obtained from \(L_*\) and \(U_*\) after we replace every \(Y_j\in {\mathbb Y}_3\) by the corresponding observation \(X_j\). We have
In the last step we applied (47). Now an application of the Berry–Esseen bound due to van Zwet [37] shows (68). The proof of the second inequality of (64) is complete.
Step A.2. Proof of the first inequality of (64). We introduce events
(recall that \(\varepsilon _m\) is defined in (53)) and write \(I_1(\varepsilon )\) in the form \(I_1(\varepsilon ) =\mathbf{E}\, \mathbb I_{\mathbb A}\mathbb I_{{\mathbb S}}\mathbb I_{\mathbb L}\). We have
To upper bound \(I_1(\varepsilon )\) we use the following strategy. We can upper bound the probability \(\mathbf{P}\{\mathbb L\}\) using the Berry–Esseen inequality,
Furthermore, one can show that the probability \(\mathbf{P}\{\mathbb A\}=O(N^{-6\nu })\). We are going to make use of both of these bounds. However, since the events \(\mathbb A\) and \(\mathbb L\) refer to the same set of random variables \(Y_{m+1},\dots , Y_N\), we cannot argue directly that \(\mathbf{E}{\mathbb I}_{\mathbb A}{\mathbb I}_{\mathbb L}\approx \mathbf{P}\{\mathbb A\}\mathbf{P}\{\mathbb L\}\). Nevertheless, invoking a complex conditioning argument we are able to show that
The latter inequality together with the inequalities \(\varepsilon \ge b_N>N^{-1/2}\) imply the first part of (64). Let us prove (71). As the proof is rather involved we start by providing an outline. Let the integers n and M be defined by (29). Split \(\{1,\dots , N\}=O_0\cup O_1\cup \dots \cup O_n\), where \(O_0=\{1,\dots , m\}\) and where the sets \(O_i\), for \(1\le i\le n\), are defined in (30). Split L, see (63),
and where \(L_0=N^{-1/2}\sum _{j\in O_0}g(X_j)\). Observe that \(\mathbb I_{\mathbb L}\) is a function of \(L_0, L_{1},\dots , L_n\). The random variables \(\mathbb I_{\mathbb A}\) and \(\mathbb I_{\mathbb V}\) are functions of \(Y_{m+1},\dots , Y_N\) and do not depend on \(X_1,\dots , X_m\). Therefore, denoting
we obtain from (70)
Clearly, the bound \(\mathcal{M}\le c_*\mathcal{R}\) would imply (71). Unfortunately, we are not able to establish such a bound directly. In what follows we prove (71) using a delicate conditioning which allows us to estimate quantities like \(\mathcal{M}\).
Step A.2.1. Firstly we replace \(L_k\), \(1\le k\le n\), by smooth random variables
where \(\xi _{1},\dots , \xi _n\) are symmetric i.i.d. random variables with the density function defined by (22) with \(k=6\) and \(a=1/6\) so that the characteristic function \(t\rightarrow \mathbf{E}\exp \{it\xi _1\}\) vanishes outside the unit interval \(\{t:\, |t|<1\}\). Note that \(\mathbf{E}\xi _1^4<\infty \). We assume that the sequences \(\xi _1,\, \xi _2, \dots \) and \(X_1,\dots , X_m,Y_{m+1},\dots , Y_N\) are independent. In particular, \(\xi _k\) and \(L_k\) are independent.
Introduce the event
Note that
Using Markov’s inequality and the inequality \(\mathbf{E}\xi ^4\le c\) we estimate the probability
where in the last step we used \(\varepsilon ^2N\ge b_N^2N\ge c'_*\). Hence we have
In the subsequent steps of the proof we replace the conditioning on \(L_1,\dots , L_n\) (in (73)) by conditioning on the random variables \(g_1,\dots , g_n\). Since the latter random variables have densities (their densities are analysed in Lemma 7 below) the corresponding conditional distributions are much easier to handle. Moreover, we restrict the conditioning to the event where these densities are positive.
Step A.2.2. Given \(w>0\), consider the events \(\{|g_k|\le n^{-1/2}w\}\) and their indicator functions \(\mathbb I_k=\mathbb I_{\{|g_k|\le n^{-1/2}w\}}\). Using the simple inequality \(n\mathbf{E}g^2_k\le c_*\) (where \(c_*\) depends on \(M_*\) and r) we obtain from Chebyshev’s inequality that
where the last inequality holds for a sufficiently large constant w (depending on \(M_*,\, r\)). Fix w such that (76) holds and introduce the event \(\mathbb B^*=\{\sum _{k=1}^{n}\mathbb I_k> n/4\}\). Hoeffding’s inequality shows \(\mathbf{P}\{\mathbb B^*\}\ge 1-\exp \{-n/8\}\). Therefore,
Given a binary vector \(\theta =(\theta _1,\dots ,\theta _{n})\) (with \(\theta _k\in \{0,1\}\)) write \(|\theta |=\sum _k\theta _k\). Introduce the event \(\mathbb B_\theta =\{\mathbb I_k=\theta _k, \, 1\le k\le n\}\) and the conditional expectation
Note that \(\mathbb I_{\mathbb B_{\theta }}\), the indicator of the event \(\mathbb B_{\theta }\), is a function of \(g_1,\dots , g_{n}\). It follows from the identities
(here \(\mathbb B_{\theta }\cap \mathbb B_{\theta '}=\emptyset \), for \(\theta \not =\theta '\)) that
We shall show below that uniformly in \(\theta \), satisfying \(|\theta |> n/4\), we have
This bound in combination with (70), which extends to \({\tilde{\mathbb L}}\) as well, implies
Combining the latter inequalities with (75) and (77) we obtain (71).
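The Hoeffding step used for \(\mathbb B^*\) above can be sketched as follows, assuming (76) guarantees \(\mathbf{P}\{\mathbb I_k=1\}\ge 1/2\) for each k: the indicators \(\mathbb I_1,\dots ,\mathbb I_n\) are independent and \([0,1]\)-valued, so Hoeffding's inequality gives

\[ \mathbf{P}\Bigl \{\sum _{k=1}^{n}\mathbb I_k\le \frac{n}{4}\Bigr \} \le \mathbf{P}\Bigl \{\sum _{k=1}^{n}(\mathbb I_k-\mathbf{E}\mathbb I_k)\le -\frac{n}{4}\Bigr \} \le \exp \Bigl \{-\frac{2(n/4)^2}{n}\Bigr \}=e^{-n/8}, \]

which is the bound \(\mathbf{P}\{\mathbb B^*\}\ge 1-\exp \{-n/8\}\) claimed in Step A.2.2.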
Step A.2.3. Here we show (78). Fix \(\theta =(\theta _1,\dots , \theta _{n})\) satisfying \(|\theta |> n/4\). Denote, for brevity, \(h=|\theta |\) and assume without loss of generality that \(\theta _i=1\), for \(1\le i\le h\), and \(\theta _j=0\), for \(h+1\le j\le n\). Consider the \(h\)-dimensional random vector \({\overline{g}}_{[\theta ]}=(g_1,\dots , g_h)\). Note that the random vector \({\overline{g}}_{[\theta ]}\) and the sequences of random variables
are independent. Recall S from (55) and note that the terms \(S_{\theta }\) and \(S'_{\theta }\) of the decomposition
are independent as well.
For \({\overline{z}}_{[\theta ]} = (z_1,\dots , z_h)\in {\mathbb R}^h\) we have \( m_{\theta }(z_1,\dots , z_n) \le {\tilde{m}}_{\theta }({\overline{z}}_{[\theta ]})\), where
denotes the "ess sup" taken with respect to almost all values of \({\mathbb Y}_{\theta }\) and \(\xi _{\theta }\). To prove (78) we show that
Let us prove (79). Given \({\mathbb Y}_{\theta }\), denote \(f_{\theta }=S'_{\theta }\) (note that \(S'_{\theta }\) is a function of \(\mathbb Y_{\theta }\)). Using the notation (40), we have for the interval \(J'_p=N^{-1/2}J_p\),
Note that the factor \({\mathbb I}_{{\mathbb B}_{\theta }}\) on the right side is nonzero whenever \({\overline{z}}_{[\theta ]}=(z_1,\dots , z_h)\) satisfies \(|z_i|\le w/\sqrt{n}\), for \(i=1,\dots , h\). Introduce \(L^r\)-valued random variables
and the regular conditional probability
Here \(\mathcal{A}\) denotes a Borel subset of \(L^r\times \dots \times L^r\) (h-times). By independence, there exist regular conditional probabilities
such that for Borel subsets \({\mathcal{A}}_i\) of \(L^r\) we have
In particular, for every \({\overline{z}}_{[\theta ]}\), the regular conditional probability \(P({\overline{z}}_{[\theta ]};\cdot )\) is the (measure theoretical) extension of the product of the regular conditional probabilities (81). Therefore, denoting by \(\psi _i\) a random variable with values in \(L^r\) and with the distribution
we obtain that the distribution of the sum
of independent random variables \(\psi _1,\dots , \psi _h\) is the regular conditional distribution of \(S_{\theta }\), given \({\overline{g}}_{[\theta ]}={\overline{z}}_{[\theta ]}\). In particular, the expectation in the right side of (80) equals \(\delta _{\varepsilon _m}(f_{\theta }+\zeta )\), where
and where \(\mathbf{E}_\zeta \) denotes the conditional expectation given all the random variables, but \(\zeta \). Recall \(\varepsilon _m\) defined by (53) and note that for any \(\varepsilon _*\) satisfying the inequality
we have
We put \(\varepsilon _*=\mu _* |T_0| N^{-1/2}/20\) and apply Lemma 1 to upper bound \(\delta _{\varepsilon _*}(f_{\theta }+\zeta )\) (the quantity \(\mu _*\) is defined in (97) below). We will use the inequalities \(c_*\delta _3^2/n\le \mu _*^2\le c_*'\delta _3^2/n\) that follow from (217) below. Note that for \(T_0\) satisfying (38), integers m, n as in (17), (29), and the quantity \(\delta _3\) (see (41)) satisfying
the inequality (85) holds, provided that N is sufficiently large (\(N>C_*\)). Moreover, we have
Now Lemma 1 (together with the moment inequalities of Lemma 10) implies the inequality
where the number \(\kappa _*\), defined in (97), satisfies \(\kappa _*\le c_*\delta _3^{-r/(r-2)}\), by (218).
Denote \({\tilde{r}}=r^{-1}+(r-2)^{-1}\). It follows from (89), (88) and (86), for \(r>4\), that
In the last step we used the simple bound \(\delta _3^2\le c_*\), see (200), and the inequality \(1+\delta _3^{-{\tilde{r}}}\le 2+\delta _3^{-1}\), which holds for \({\tilde{r}}<1\). Note that (90) together with (80) and (84) implies (79). The proof of the first inequality of (60) is complete.
Step B. Here we prove the second bound of (60). It is convenient to write the \(L^r\)-valued random variable (55) in the form
Observe that \(U_1,\dots , U_{n-1}\) are independent and identically distributed. We are going to apply Lemma 1 conditionally, given \(U_n\), to the probability
To upper bound \({\tilde{p}}(f)\) we proceed similarly as in the proof of (90). Lemma 9 shows that \(U_1,\dots , U_{n-1}\) satisfy the moment conditions of Lemma 1. Note that in this case the quantity \(\mu _*\) satisfies \(c_*\delta _3^2/n\le \mu ^2_*\le c_*'/n\) (these inequalities follow from (201)). The right inequality implies the bound \(\varepsilon _*\le c_*N^{-48\nu }\) instead of (88) above. As a result we obtain a different power of \(\delta _3\) in the upper bound below. Proceeding as in the proof of (90), see (86), (88), (89), we obtain
In the last step we used the inequality \(1+\delta _3^{-r/(2(r-2))}\le 2+\delta _3^{-1}\), which follows from \(r/(2(r-2))< 1\) (recall that \(r>4\)). Therefore, we have \(\mathbf{P}\{\mathbb B\}\le \mathbf{E}{\tilde{p}}(U_n)\le c_*\mathcal{R}\), where \(\mathcal{R}\) is defined in (71). This completes the proof of the second inequality in (60).
3.4 Proof of (26) for \(k = 3\)
Here we prove the bound \(|I_3|\le c_*N^{-1-\nu }\), see (26). It follows from (48) and the identity \(\mathbf{E}_{\mathbb Y}\exp \{itT'\}=\alpha _t^m\exp \{itW'\}\), see (13), that
Recall the event \({\mathbb S}=\{\Vert S\Vert _r<N^{\nu /10}\}\), where S is defined in (55). We have
Using Lemma 13 we upper bound the second term on the right: \(\mathbf{P}\{\Vert S\Vert _r\ge N^{\nu /10}\}\le c_*N^{-3}\). Furthermore, the one-term expansion of the exponent in (13) in powers of \(itN^{-3/2}\sum _{j=m+1}^N\psi (X_1,Y_j)\) shows the inequality
It follows from (7) that the first summand is bounded from above by \(1-v\), for some \(v>0\) depending on \(A_*,M_*,D_*, \delta \) only, see the proof of (36). Furthermore, the second summand is bounded from above by \(N^{-9\nu /10}\) almost surely. Therefore, for sufficiently large \(N>C_*\) we have \({\mathbb I}_{\mathbb S}|\alpha _t|\le 1-v/2\) uniformly in t. Invoking this bound in (93) we obtain
for m satisfying (17). The latter inequality implies that the integral in (92) is bounded from above by \(c_*N^{-2}\) thus completing the proof.
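The last estimate rests on an elementary exponential-versus-polynomial comparison; a sketch, assuming (17) makes \(m\) of order at least \(c\log N\): for any fixed \(K>0\),

\[ (1-v/2)^{m}\le e^{-mv/2}\le N^{-K} \qquad \text{whenever}\qquad m\ge \frac{2K}{v}\,\log N, \]

so the factor \(|\alpha _t|^m\) dominates any polynomial (in N) length of the integration interval.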
4 Combinatorial concentration bound
We start the section by introducing some notation and collecting auxiliary inequalities. Then we formulate and prove Lemmas 1 and 2.
Introduce the number
where \(c_g=1+\Vert g\Vert _r/\Vert g\Vert _2\) and \(c_r=(7/24)2^{-(r-1)}\). Denote
It follows from the identity \(\rho ^* = \rho (2^{-1}\sigma \delta _2,\, \sigma N^{-\nu +1/2})\) and the simple inequality \(a_1\le \delta _2/4\), see (35), that \(\rho ^* \ge \rho (2 \sigma \,a_1, \sigma N^{-\nu +1/2})\). Furthermore, it follows from (169) and the assumption \(\sigma _\mathbb T^2=1\) that \(1/2<\sigma <2\) for sufficiently large N (\(N>C_*\)). Therefore, \(\rho ^* \ge \rho (a_1,2N^{-\nu +1/2}) \ge \delta '\), where the last inequality follows from (36). We obtain, for \(N>C_*\),
where the number \(\delta '\) depends on \(A_*,D_*,M_*,\nu _1, r,s,\delta \) only. In what follows we use the notation \(c_0=10\). Let \(L_0^r= \{y\in L^r: \int _\mathcal{X} y(x)P_X(dx)=0\}\) denote a subspace of \(L^r\). Observe that \(\mathbf{E}g(X_1)=0\) implies \(y^*(=p_g(y))\in L_0^r\), for every \(y\in L_0^r\).
4.1 Lemma 1
Let \(\psi _1,\dots , \psi _n\) denote independent random vectors with values in \(L_0^r\). For \(k=1,\dots , n\), write
Let \(\overline{\psi }_i\) denote an independent copy of \(\psi _i\). Write \(\psi _i^*=p_g(\psi _i)\) and \(\overline{\psi }_i^*=p_g({\overline{\psi }}_i)\), see (31). Introduce random vectors
We shall assume that, for some \(c_A\ge c_D\ge c_B>0\),
for every \(1\le i\le n\). Furthermore, denote \(\mu _i^2=\mathbf{E}\Vert {\tilde{\psi }}^*_i\Vert _2^2\) and \({\tilde{\kappa }}_i^{r-2}=\frac{8}{3}\frac{\mathbf{E}\Vert {\tilde{\psi }}_i\Vert _r^r}{\mu _i^r}\),
Observe that, by Hölder’s inequality and (32), we have \({\tilde{\kappa }}_i>1\), for \(i=1,\dots , n\).
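The claim \({\tilde{\kappa }}_i>1\) can be sketched as follows, assuming (32) amounts to \(\Vert \cdot \Vert _2\le \Vert \cdot \Vert _r\) (which holds since \(P_X\) is a probability measure): writing \(X=\Vert {\tilde{\psi }}_i^*\Vert _2\) and \(Y=\Vert {\tilde{\psi }}_i\Vert _r\), Lyapunov's inequality and the contractivity of the projection \(p_g\) in \(L^2\) give

\[ \mu _i^r=(\mathbf{E}X^2)^{r/2}\le \mathbf{E}X^r\le \mathbf{E}Y^r, \qquad \text{so}\qquad {\tilde{\kappa }}_i^{r-2}=\frac{8}{3}\,\frac{\mathbf{E}Y^r}{\mu _i^r}\ge \frac{8}{3}>1. \]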
Lemma 1
Let \(4<r\le 5\) and \(0<\nu <10^{-2}(r-4)\). Assume that \(n\ge N^{5\nu }\). Suppose that
Assume that (95), (96) as well as (106), (112) (below) hold. There exists a constant \(c_*>0\), depending on \(r,s,\nu , A_*, D_*, M_*, \delta \) only, such that for every \(T_0\) satisfying (38) we have
for an arbitrary non-random element \(f\in L_0^r\). Here \( {\varepsilon _*}=\frac{\mu _*}{2c_0}\frac{|T_0|}{\sqrt{N}}\). The function \(\delta _s(\cdot , I(T_0))\) is defined in (40).
In Step A.2.3 of Sect. 3 we apply this lemma to the random vector \(\zeta =\psi _1+\dots +\psi _h\), see (83). In Step B of Sect. 3 we apply this lemma to the random vector \(S'\), see (91).
Proof
We shall consider the case where \(T_0>0\). For \(T_0<0\) the proof is the same. We can assume without loss of generality that \(c_0<N^{\nu }\). Denote \(X=\Vert \tilde{\psi }_i^*\Vert _2\) and \(Y=\Vert \tilde{\psi }_i\Vert _r\) and \(\mu =\mu _i\), \(\kappa ={\tilde{\kappa }}_i\). By (32), we have \(Y\ge X\).
Step 1. Here we construct the bound (100), see below, for the probability \(\mathbf{P}\{B_i\}\), where
Write
Substitution of the bounds
gives
Finally, invoking the identity \(\kappa ^{r-2}=(8/3)\mathbf{E}Y^r/\mu ^r\) we obtain
Introduce the (random) set \(J=\{i:\, B_i\ {\text {occurs}}\}\subset \{1,\dots , n\}\). Hoeffding’s inequality applied to the random variable \(|J|=\mathbb I_{B_1}+\dots +\mathbb I_{B_n}\) shows
In the last step we invoke (98) and use (100).
Step 2. Here we introduce randomization. Note that for any \(\alpha _i\in \{-1, +1\}\), \(i=1,\dots , n\), the distributions of the random vectors
coincide. Therefore, denoting
we have for \(s>0\),
for every choice of \(\alpha _1,\dots , \alpha _n\). From now on let \(\alpha _1,\dots , \alpha _n\) denote a sequence of independent identically distributed Bernoulli random variables independent of \({\tilde{\psi }}_i, {\hat{\psi }}_i\), \(1\le i\le n\), and with probabilities \(\mathbf{P}\{\alpha _1=1\}=\mathbf{P}\{\alpha _1=-1\}=1/2\). Denoting by \(\mathbf{E}_\alpha \) the expectation with respect to the sequence \(\alpha _1,\dots , \alpha _n\) we obtain
We are going to condition on \({\tilde{\psi }}_i\) and \({\hat{\psi }}_i\), \(1\le i\le n\), while taking expectations with respect to \(\alpha _1,\dots , \alpha _n\). It follows from (101), (102) and the fact that the random variable |J| does not depend on \(\alpha _1,\dots , \alpha _n\) that
where
denotes the conditional expectation given \({\tilde{\psi }}_i,\, {\hat{\psi }}_i\), \(1\le i\le n\). Note that (99) is a consequence of (103) and of the bound
Let us prove this bound. Introduce the integers
Let us show that
It follows from the inequalities
that
Note that (98) implies \(k_*\le n^{1/4}\). Therefore, the inequality
implies \(l\le (3/16)k_*^{-2}n=\rho n\). We obtain (105).
Given \({\tilde{\psi }}_i,\, {\hat{\psi }}_i\), \(1\le i\le n\), consider the corresponding set J, say \(J=\{i_1,\dots , i_k\}\). Assume that \(k\ge \rho n\). From the inequality \(\rho n\ge n_0\), see (105), it follows that we can choose a subset \(J'\subset J\) of size \(|J'|=n_0\). Split
and denote \(f+\zeta '+{\hat{\zeta }}_n=f_*\). Note that \(f_*\in L_0^r\) almost surely. Let
where \(\mathbf{E}'\) denotes the conditional expectation given all the random variables, but \(\{\alpha _i,\, i\in J'\}\). The bound (104) would follow if we show that
Step 3. Here we prove (107). Note that for \(j\in J'\) the vectors
satisfy
Given \(A\subset J'\) denote
We are going to apply Kleitman’s theorem on symmetric partitions (see, e.g. the proof of Theorem 4.2, Bollobas [15]) to the sequence \(\{x_j^*,\, j\in J'\}\) in \(L^2\). Since for \(j\in J'\) we have \(\Vert x_j^*\Vert _2\ge c_0\varepsilon _*\), it follows from Kleitman’s theorem that the collection \(\mathcal{P}(J')\) of all subsets of \(J'\) splits into non-intersecting non-empty classes \(\mathcal{P}(J')=\mathcal{D}_1\cup \cdots \cup \mathcal{D}_s\), such that the corresponding sets of linear combinations \( V_t = \bigl \{ x^*_A,\, A\in \mathcal{D}_t \bigr \}\), \(t=1,2,\dots , s\), are sparse, i.e., given t, for \(A,A'\in \mathcal{D}_t\) and \(A\not =A'\) we have
Furthermore, the number of classes s is bounded from above by \(\left( {\begin{array}{c}n_0\\ \lfloor n_0/2\rfloor \end{array}}\right) \).
Next, using Lemma 2 we shall show that given \(f_*\) the class \(\mathcal{D}_t\) may contain at most one element \(A\in \mathcal{D}_t\) such that
This means that there are at most \(\left( {\begin{array}{c}n_0\\ \lfloor n_0/2\rfloor \end{array}}\right) \) different subsets \(A\subset J'\) for which (110) holds. This implies (107).
Finally, (99) follows from (103), (104), (107).
Given \(f_*\in L_0^r\) let us show that there is no pair \(A,\,A'\) in \(\mathcal{D}_t\) satisfying (110). Fix \(A,A'\in \mathcal{D}_t\). We have, by (108) and the choice of \(n_0\),
Denoting \(S_{A}=f_*+{\tilde{x}}_A\) and \(S_{A'}=f_*+{\tilde{x}}_{A'}\) we obtain
Assume that \(S_A\) and \(S_{A'}\) satisfy the second inequality of (110), i.e., \(\Vert S_A\Vert _r\le N^{\nu }\) and \(\Vert S_{A'}\Vert _r\le N^{\nu }\). We are going to apply Lemma 2 to the vectors \(S_A\) and \(S_{A'}\). In order to check the conditions of Lemma 2 note that (114) and (115) are verified by (108), (109) and (111). Furthermore, the inequalities \(c_0<N^{\nu }\) and
imply \( N^{2\nu -1/2}\le \varepsilon _*\). Finally, we can assume without loss of generality that \(\varepsilon _*\le c_*'\), where \(c_*':=\min \bigl \{( \delta '/4)^{r/2}, (A_*^{1/2}/6)^{r/2}\bigr \}\). Otherwise (99) follows from trivial inequalities
and the inequality \(\kappa _*>1\).
Now Lemma 2 implies \(\min \{v^2(S_A), \, v^2(S_{A'})\}\le 1-\varepsilon _*^2\) thus completing the proof of Lemma 1. \(\square \)
4.2 Lemma 2
Here we formulate and prove Lemma 2. Let us introduce first some notation. Given \(y\in L^r(=L^r(\mathcal X,P_X))\) define the symmetrization \(y_{s}\in L^r(\mathcal X\times \mathcal X,P_X\times P_X)\) by \(y_{s}(x,x')=y(x)-y(x')\), for \(x,x'\in \mathcal X\). In what follows \(X_1,X_2\) denote independent random variables with values in \(\mathcal X\) and with the common distribution \(P_X\). By \(\mathbf{E}\) we denote the expectation taken with respect to \(P_X\). For \(h\in L^r\) we write
Furthermore, for \(2\le p\le r\), denote
Note that for \(y\in L_0^r\) we have \(y^*(=p_g(y))\in L_0^r\) and, therefore,
Let \(y_1,\dots , y_k, f\) be non-random vectors in \(L^r\). We shall assume that these vectors belong to the linear subspace \(L_0^r\). Given non-random vectors \(\alpha =\{\alpha _i\}_{i=1}^k\) and \(\alpha '=\{\alpha '_i\}_{i=1}^k\), with \(\alpha _i, \alpha '_i\in \{-1, +1\}\), denote
Lemma 2
Let \(\varkappa >0\). Assume that (95) holds and suppose that
Given \(T_0\), satisfying (38), write \(T^*=N^{1/2}T_0^{-1}\) and assume that
Suppose that \(\Vert S_{\alpha }\Vert _r\le N^{\nu }\) and \(\Vert S_{\alpha '}\Vert _r\le N^{\nu }\) and
Then \(\min \{v^2(S_{\alpha }),\, v^2(S_{\alpha '})\}\le 1-\varepsilon ^2\).
Recall that the functionals \(v(\cdot ),\tau (\cdot )\), \(u_t(\cdot )\) and the interval \(I=I(T_0)\) used in the proof below are defined in (39).
Proof
Note that \(\delta _1<1/10\) and \(\delta _2<1/12\). In particular, we have
Step 1. Assume that the inequality \(\min \{v^2(S_{\alpha }),\, v^2(S_{\alpha '})\}\le 1-\varepsilon ^2\) fails. Then for some \(s,t\in I\) we have
see (39). Fix these s, t and denote
We are going to apply the inequality (256),
to \(Z=-{\tilde{X}}\) and \(Y=s(g+N^{-1/2}S_{\alpha '})\). It follows from this inequality and (117) that
In view of the identity \(|\mathbf{E}e^{-i{\tilde{X}}}|=|\mathbf{E}e^{i{\tilde{X}}}|\) we have
Step 2. Here we shall show that (118) contradicts the second inequality of (115). Firstly, we collect some auxiliary inequalities. Write the decomposition (31) for \(S_{\alpha }\) and \(S_{\alpha '}\),
Decompose
where \(v\in {\mathbb R}\) and where \(h\in L^r\) is \(L^2\)-orthogonal to g. An application of (34) to \(S_{\alpha }^*\) and \(S_{\alpha '}^*-S_{\alpha }^*\) gives
Furthermore, it follows from the simple inequality
that
Note that for a and \(a'\) defined in (119) we obtain from (33) and (115) that
Step 4.2.1. Consider the case where \( |s-t|<\delta _2\). Invoking the inequalities \(\Vert S_{\alpha }\Vert _r\le N^{\nu }\) and (115) we obtain from (120) that
Furthermore, using (116), (94), and \(N^{\nu -1/2}\le \varepsilon \), we obtain for \(4\le r\le 5\)
Note that (32) implies \(\Vert S_{\alpha }^*\Vert _2\le \Vert S_{\alpha }\Vert _r\le N^{\nu }\). This inequality in combination with (115) and (121) gives
Invoking (116) and using \(c_0>10\), \(\delta _2<12^{-1}\), and \(N^{\nu -1/2}\le \varepsilon \) we obtain
Now we are going to apply Lemma 12, statement a), to \({\tilde{X}}=vg+h\). For this purpose we verify the conditions of this lemma. Firstly, note that (125) and (113) imply \(\Vert h_s\Vert _2^2\ge (8/10)c_0^2\varepsilon ^2\). Furthermore, it follows from the simple inequality \(\mathbf{E}|h(X_1)-h(X_2)|^r\le 2^r\mathbf{E}|h(X_1)|^r\) and (124) that \(\Vert h_s\Vert _r^r\le 3(2/3)^r\varepsilon ^2\). Therefore, we obtain, for \(4\le r\le 5\),
Furthermore, the inequalities (122), (123) and (116) imply
for \(N^{\nu -1/2}\le \varepsilon \le 1\). Invoking (94) and using the inequality \(\Vert g_s\Vert _r^r\le 2^r\Vert g\Vert _r^r\) and the identity \(\Vert g_s\Vert _2^2=2\Vert g\Vert _2^2\) we obtain
as required by Lemma 12a). This lemma implies
In the last step we used (113). Now (125), for \(c_0\ge 10\), contradicts (118).
Step 4.2.2. Consider the case where \(\delta _2<|s-t|\le \delta _1 N^{-\nu +1/2}\). It follows from (120), (115) and (116) that
In the last step we used \(\delta _2<1/3\). From (122), (123) and (116), we obtain for \(\delta _2\le |s-t|\) and \(N^{\nu -1/2}\le \varepsilon \),
provided that \(\varepsilon ^{2/r}<\Vert g\Vert _2/6\). Similarly, using in addition, \(\delta _1, \delta _2<1/4\) and \(\varepsilon <\Vert g\Vert _2\), we obtain, for \(|s-t|\le \delta _1N^{-\nu +1/2}\),
It follows from these inequalities, see (95), that
Finally, invoking (126) and (37), we get
Once again we obtain a contradiction to (118), thus completing the proof.\(\square \)
5 Expansions
Here we prove the bound
where \(t_1=N^{1/2}/10^3\beta _3\). For the definition of \({\tilde{\mathbb T}}\) and \({\hat{G}}\) see Sect. 2.4. Here and below \(c_*\) denotes a constant depending on \(A_*,M_*,D_*,r,s,\nu _1\) only. We prove (127) for sufficiently large N, that is, we shall assume that \(N>C_*\), where \(C_*\) is a number depending on \(A_*,M_*,D_*,r,s,\nu _1\) only. Note that for \(N<C_*\), the bound (127) becomes trivial, since in this case the integral is bounded by a constant.
Let us first introduce some notation. Denote \(\Omega _m=\{1,\dots , m\}\). For \(A\subset \Omega _N\) write \({\mathbb U}_1(A)=\sum _{j\in A}g_1(X_j)\). Given complex valued functions f, h we write \(f\prec \mathcal R\) if
and write \(f\sim h\) if \(f-h\prec \mathcal R\). In particular, (127) can be written in shorthand as \(\mathbf{E}e^{it{\tilde{\mathbb T}}}\sim {\hat{G}}(t)\).
In order to prove (127) we show that
In what follows we use the notation of Sect. 2. We denote \(\alpha (t)=\mathbf{E}e^{itg(X_1)}\). We assume that (16) holds.
5.1 Proof of the first relation of (128)
We have, see (19),
where the random variables \(\Lambda _j\) are introduced in Sect. 2.4. We shall show that
The second relation follows from the moment bounds of Lemma 5 via Taylor expansion. We have
By Lyapunov’s inequality,
Invoking the moment bounds of Lemma 5 we obtain \(|t|\mathbf{E}|{\tilde{\Lambda }}_2|\prec \mathcal R\), thus proving the second part of (129).
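Lyapunov's inequality used here is the moment monotonicity \(\mathbf{E}|Y|\le (\mathbf{E}Y^2)^{1/2}\); for an empirical distribution it reduces to the Cauchy–Schwarz inequality and can be checked directly. A minimal sketch with an arbitrary sample (illustrative only):

```python
import random

random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(10_000)]

# Empirical first absolute moment and root second moment.
abs_mean = sum(abs(x) for x in sample) / len(sample)
rms = (sum(x * x for x in sample) / len(sample)) ** 0.5

print(abs_mean <= rms)  # Lyapunov: E|Y| <= (E Y^2)^{1/2}
```

The comparison holds for every sample, not just in the limit, since it is Cauchy–Schwarz applied to the empirical measure.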
In order to prove the first part we combine Taylor’s expansion with bounds for characteristic functions. Expanding the exponent we obtain
Invoking the identities
we obtain, for \(\gamma _2<c_*\) and \(\zeta _2<c_*\), see (5), and \(m\le N^{1/12}\), that \(R\prec \mathcal R\). We complete the proof of (129) by showing that
Let us prove (131). Split \({\mathbb W}={\mathbb W}_1+{\mathbb W}_2+{\mathbb W}_3+R_{W}\), where
Here \(\Omega '=\{m+1,\dots , N\}\). Denote \({\mathbb R}={\mathbb U}_2^*+{\mathbb W}_3+R_W\) and \({\mathbb U}_1=\sum _{j=1}^Ng_1(X_j)\). We have \( {\tilde{\mathbb T}}={\mathbb U}_1+{\mathbb W}_2+{\mathbb R}\). Expanding the exponent in powers of \(it{\mathbb R}\) we obtain
where
In the last step we applied the Cauchy–Schwarz inequality. Combining (130) with the identities
and invoking the simple bound
we obtain \(t^2(r_1+r_2)(r_3+r_4+r_5)\prec \mathcal R\). Therefore, (132) implies
Let us show that \(t\mathbf{E}e^{it({\mathbb U}_1+{\mathbb W}_2)}{\tilde{\Lambda }}_1\sim 0\). Expanding the exponent in powers of \(it{\mathbb W}_2\) we get
where \(\theta _1, \theta _2\) are functions of \({\mathbb W}_2\) satisfying \(|\theta _i|\le 1\).
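The repeated "expanding the exponent" steps rest on the elementary estimate \(|e^{ix}-\sum _{k<n}(ix)^k/k!|\le |x|^n/n!\), which is what produces the bounded factors \(\theta _1,\theta _2\). A quick numeric verification on a grid (an illustration, not part of the argument):

```python
import cmath
import math

def taylor_remainder(x: float, n: int) -> float:
    # |e^{ix} - sum_{k<n} (ix)^k / k!|
    partial = sum((1j * x) ** k / math.factorial(k) for k in range(n))
    return abs(cmath.exp(1j * x) - partial)

# The remainder never exceeds |x|^n / n! (small slack for float error).
ok = all(
    taylor_remainder(x, n) <= abs(x) ** n / math.factorial(n) + 1e-12
    for n in (1, 2, 3)
    for x in [t / 10 for t in range(-50, 51)]
)
print(ok)
```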
Let us show that \(f_i\prec \mathcal R\), for \(i=1,2,3,4\). Split the set \(\Omega _m=\{1,\dots , m\}\) into three (non-intersecting) parts \(A_1\cup A_2\cup A_3=\Omega _m\) of (almost) equal size \(|A_i|\approx m/3\). The set of pairs \(\bigl \{\{i,j\}\subset \Omega _m\bigr \}\) splits into six (non-intersecting) parts \(B_{kr}\), \(1\le k\le r\le 3\) (the pair \(\{i,j\}\) belongs to \(B_{kr}\) if \(i\in A_k\) and \(j\in A_r\)). Write
Let us prove \(f_4\prec \mathcal R\). We shall show that
Given a pair (k, r) denote \(A_i=\Omega _m\setminus (A_k\cup A_r)\) and write \(k_i=|A_i|\). Note that \(k_i\approx m/3\). We shall assume that \(k_i\ge m/4\). Since the random variable \({\mathbb U}_1(A_i):=\sum _{j\in A_i}g_1(X_j)\) and the random variables \(\Lambda _1(k,r)\), \({\mathbb W}_2\) are independent, we have
Therefore,
The first factor on the right is bounded from above by \(\exp \{-mt^2/16N\}\), for \(k_i\ge m/4\), see (165) below. The second factor is bounded from above by r, where
Here we combined the Cauchy–Schwarz inequality and the bounds
Finally, (133) follows from (134).
The proof of \(f_3\prec \mathcal R\) is almost the same as that of \(f_4\prec \mathcal R\).
Let us prove \(f_2\prec \mathcal R\). Split the set \(\Omega '=\{m+1,\dots , N\}\) into three (non-intersecting) parts \(B_1\cup B_2\cup B_3=\Omega '\) of (almost) equal sizes \(|B_i|\approx (N-m)/3\). Split the set of pairs \(\bigl \{ \{i,j\}:\, m+1\le i<j\le N\bigr \}\) into (non-intersecting) groups D(k, r), for \(1\le k\le r\le 3\). The pair \(\{i,j\}\in D(k,r)\) if \(i\in B_k\) and \(j\in B_r\). Write
In order to prove \(f_2\prec \mathcal R\) we shall show that
Write \(B_i=\Omega '\setminus (B_k\cup B_r)\) and denote \(m_i=|B_i|\). We shall assume that \(m_i\ge N/4\). Since the random variable \({\mathbb U}_1(B_i)=\sum _{j\in B_i}g_1(X_j)\) and the random variables \(\Lambda _1\) and \({\mathbb W}_2(k,r)\) are independent, we have, cf. (134),
The first factor on the right is the product \(|\alpha ^{m_i}(t)|\le e^{-m_it^2/4N}\), see the argument used in the proof of (133) above. The second factor is bounded from above by \({\tilde{r}}\), where
Finally, we obtain, using the inequality \(m_i\ge N/4\),
This in combination with (136) shows (135). We obtain \(f_2\prec \mathcal R\).
Let us prove \(f_1\prec \mathcal R\). We shall show that \(f^*\prec \mathcal R\) and \(f^{\star }\prec \mathcal R\), where
satisfy \(f^*+f^{\star }=f_1\).
Let us show \(f^{\star }\prec \mathcal R\). Denote \({\mathbb U}_1^{\star }=\sum _{j=m+1}^Ng_1(X_j)\). We obtain, by the independence of \({\mathbb U}_1^{\star }\) and \(\Lambda _1\), that
Invoking, for \(N-m>N/2\), the bound \( |\mathbf{E}e^{it{\mathbb U}_1^{\star }}| \le e^{-t^2/8} \), see (165) below, and the bound \(\mathbf{E}|\Lambda _1|\le (\mathbf{E}\Lambda _1^2)^{1/2}\le c_*mN^{-3/2}\) we obtain
Let us prove \(f^*\prec \mathcal R\). We shall show that, for \(1\le k\le r\le 3\),
Proceeding as in the proof of (135) we obtain the chain of inequalities
In the last step we applied the Cauchy–Schwarz inequality and the simple bound \(\mathbf{E}\Lambda _4^2(k,r)\le c_*mN^{-3}\). Clearly, (138) implies (137).
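A pattern used repeatedly in this section is factoring an independent linear part out of the expectation: if \(U\) is independent of the pair \((V,W)\), then \(\mathbf{E}\,e^{it(U+V)}W=\mathbf{E}e^{itU}\cdot \mathbf{E}\,e^{itV}W\). A toy check with small discrete variables (illustrative; the names are not the paper's):

```python
import cmath
from itertools import product

t = 0.7
U_vals = [-1.0, 1.0]  # U uniform on {-1, 1}, independent of (V, W)
VW_vals = [(0.0, 1.0), (1.0, -2.0), (2.0, 0.5)]  # joint law of (V, W), uniform

# Left side: E e^{it(U+V)} W over the product law.
lhs = sum(cmath.exp(1j * t * (u + v)) * w
          for u, (v, w) in product(U_vals, VW_vals))
lhs /= len(U_vals) * len(VW_vals)

# Right side: E e^{itU} * E e^{itV} W.
rhs = (sum(cmath.exp(1j * t * u) for u in U_vals) / len(U_vals)
       * sum(cmath.exp(1j * t * v) * w for v, w in VW_vals) / len(VW_vals))

print(abs(lhs - rhs) < 1e-12)
```

The factorization is exact; in the proofs it lets the first factor be bounded by \(|\alpha (t)|^{|B|}\) while the second is estimated by moment inequalities.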
5.2 Proof of the second relation of (128)
Here we prove the second relation of (128). Firstly, we shall show that
where \(w=\mathbf{E}g_3(X_1,X_2,X_3)g_1(X_1)g_1(X_2)g_1(X_3)\).
Let m(t) be an integer-valued function such that
and put \(m(t)\equiv 10\), for \(|t|\le C_1\). Here \(C_1\) denotes a large absolute constant (one can take, e.g., \(C_1=200\)). Assume, in addition, that the numbers \(m=m(t)\) are even.
5.2.1 Proof of (139)
Given m write
where
In order to show (139) we expand the exponent in powers of \(it{\mathbb H}\) and \(it{\mathbb U}_3\),
where \( |R|\le t^2(\mathbf{E}{\mathbb H}^2+\mathbf{E}|{\mathbb U}_3{\mathbb H}|) \). Invoking the bounds, see (166), (167), (5), (6),
we obtain, by the Cauchy–Schwarz inequality, \(|R|\le c_*t^2 N^{-2-\nu _1}\prec \mathcal R\). We complete the proof of (139) by showing that
Before proving (143) we collect some auxiliary inequalities. For \(m=2k\) write
Furthermore, split the sum
In what follows we shall use the simple bounds, see (5),
Let us prove (143). Expand the exponent \(\exp \{it({\mathbb U}_1+{\mathbb Z}_1+\dots +{\mathbb Z}_4)\}\) in powers of \(it{\mathbb Z}_1\) to get
where \(h_1(t)=\mathbf{E}\exp \{it({\mathbb U}_1+{\mathbb Z}_2+\dots +{\mathbb Z}_4)\}it{\mathbb H}\) and where
For \(m=m(t)\) satisfying (141) we have \(R\prec \mathcal R\). Therefore, we obtain
In order to prove \(h_1\prec \mathcal R\) we write \(h_1=h_2+h_3\) and show that \(h_2, h_3\prec \mathcal R\) , where
Let us show that \(h_2\prec \mathcal R\). Firstly, we prove that
where \(h_{2.1}(t)=\mathbf{E}\exp \{it({\mathbb U}_1+{\mathbb Z}_4)\}it{\mathbb H}_1\) and, for \(j=2,3\),
Expanding the exponent in powers of \(it({\mathbb Z}_2+{\mathbb Z}_3)\) we obtain
where \(|R|\le |t|^3\mathbf{E}|{\mathbb H}_1|({\mathbb Z}_2+{\mathbb Z}_3)^2\) is bounded from above by
In the last step we used \(\mathbf{E}{\mathbb H}_1^2\le \mathbf{E}{\mathbb H}^2\) and applied (142) and (146). Therefore, (147) follows.
Let us show \(h_{2.i}\prec \mathcal R\), for \(i=1,\, 2,\,3\). The random variable \({\mathbb U}_1(A_1)\) does not depend on the observations \(X_j\), \(j\in \Omega \setminus A_1\). Therefore, we can write
Furthermore, using (165) we obtain, for \(|A_1|=m/2\),
In the last step we combined the bound \(\mathbf{E}{\mathbb H}_1^2\le c_*N^{-2-2\nu _1}\) and (146) to get
Note that choosing \(C_1\) in (141) sufficiently large implies, for \(|t|\ge C_1\),
An application of this bound to the argument of the exponent in (148) shows \(h_{2.3}\prec \mathcal R\). The proof of \(h_{2.i}\prec \mathcal R\), for \(i=1,2\), is almost the same. Therefore, we obtain \(h_2\prec \mathcal R\).
Let us prove \(h_3\prec \mathcal R\). Firstly we collect some auxiliary inequalities. Write \(m=2k\) (recall that the number m is even) and split \(\Omega _m=B \cup D\), where B denotes the set of odd numbers and D denotes the set of even numbers. Split \({\mathbb H}_2={\mathbb H}_B+{\mathbb H}_D+{\mathbb H}_C\). Here, for \(A\subset \Omega _N\) and \(|A|\ge 4\), we denote by \({\mathbb H}_B\) the sum of \(T_A\) such that \(A\cap B=\emptyset \) and \(A\cap D\not =\emptyset \); \({\mathbb H}_D\) denotes the sum of \(T_A\) such that \(A\cap B\not =\emptyset \) and \(A\cap D=\emptyset \); \({\mathbb H}_C\) denotes the sum of \(T_A\) such that \(A\cap B\not =\emptyset \) and \(A\cap D\not =\emptyset \). It follows from the inequalities (177) and (6) that
Using the notation \(z=it\exp \{it({\mathbb U}_1+{\mathbb Z}_2+{\mathbb Z}_3+{\mathbb Z}_4)\}\) write
We shall show that \(h_{3.i}\prec \mathcal R\), for \(i=1,2,3\). The relation \(h_{3.3}\prec \mathcal R\) follows from (149), (146), and the Cauchy–Schwarz inequality: \( |h_{3.3}|\le c_*|t|\, mN^{-2-\nu _1}\prec \mathcal R\).
Let us show that \(h_{3.2}\prec \mathcal R\). Expanding the exponent in powers of \(it({\mathbb Z}_2+{\mathbb Z}_3)\) we obtain
where \(|R|\le t^2\mathbf{E}|{\mathbb H}_D ({\mathbb Z}_2+{\mathbb Z}_3)|\). Combining the bounds (146) and (149) we obtain, by the Cauchy–Schwarz inequality, \(|R|\le c_*t^2mN^{-(5+2\nu _1)/2}\prec \mathcal R\). Next we show that \(h_{3.2}^*\prec \mathcal R\). The random variables \({\mathbb U}_1(D)=\sum _{j\in D}g_1(X_j)\) and \({\mathbb H}_D\) are independent. Therefore, we can write
Combining (165) and (149) we obtain, using the Cauchy–Schwarz inequality,
The proof of \(h_{3.1}\prec \mathcal R\) is similar. Therefore, we obtain \(h_3\prec \mathcal R\). This together with the relation \(h_2\prec \mathcal R\), proved above, implies \(h_1\prec \mathcal R\). Thus we arrive at (143) completing the proof of (139).
5.2.2 Proof of (140)
We start with some auxiliary moment inequalities. Split
Using the orthogonality and moment bounds for U-statistics, see, e.g., Dharmadhikari et al. [20], one can show that
and \(\mathbf{E}|Z|^s\le cN^{3s/2}\mathbf{E}|g_3(X_1,X_2,X_3)|^s\). Invoking (5) we obtain
For the sets \(A_1,A_2\subset \Omega _m\) defined in (144) write
We have \(\mathcal{D}=\mathcal{D}_1\cup \mathcal{D}_2\cup \mathcal{D}_3\) and \(W=\sum _{A\in \mathcal{D}}T_A\). Therefore, we can write \(W=W_1+W_2+W_3\), where \(W_j=\sum _{A\in \mathcal{D}_j}T_A\).
A calculation shows that
Therefore, we obtain from (5) that
Let us prove (140). Write \({\mathbb U}_3=W+Z\). Expanding the exponent in powers of itW we obtain
where, by (150), \(|R|\le t^2\mathbf{E}W^2\le c_*t^2mN^{-3}\prec \mathcal R\). This implies
In order to prove (140) we shall show that
Let us prove (152). Expanding the exponent (in \(h_5\)) in powers of itZ we obtain
where, by (150) and the Cauchy–Schwarz inequality,
We have \(h_5\sim h_6\).
It remains to show that \(h_6\sim \mathbf{E}\exp \{it{\mathbb U}_1\}itW\). Split
We have, see (146),
Expanding the exponent (in \(h_6\)) in powers of \(it{\mathbb U}_2^*\) we obtain
and where, by (150), (156) and the Cauchy–Schwarz inequality,
Therefore, we obtain \(h_6\sim h_7\).
We complete the proof of (152) by showing that \(h_7\sim \mathbf{E}\exp \{it{\mathbb U}_1\}itW\). Use the decomposition \(W=W_1+W_2+W_3\) and write
We shall show that
Expanding in powers of \(it{\mathbb U}_2^{\star }\) we obtain
where \(R_j=(it)^2\mathbf{E}\exp \{it{\mathbb U}_1\} W_j{\mathbb U}_2^{\star }\theta \) and where \(\theta \) is a function of \({\mathbb U}_2^{\star }\) satisfying \(|\theta |\le 1\). In order to prove (157) we show that \(R_j\prec \mathcal R\), for \(j=1,2,3\).
Combining (151) and (156) we obtain via the Cauchy–Schwarz inequality
Furthermore, using the fact that the random variable \({\mathbb U}_1(A_2)\) and the random variables \({\mathbb U}_2^{\star }\) and \(W_2\) are independent, we can write
Here we used (165) and the moment inequalities (151) and (156). The proof of \(R_1\prec \mathcal R\) is similar. We arrive at (157) and, thus, complete the proof of (152).
Let us prove (153). We proceed in two steps. Firstly we show
Secondly, we show
In order to prove (158) we write
and show that \(R\prec \mathcal R\). In order to bound the remainder R we write \({\mathbb U}_2={\mathbb U}_2^*+{\mathbb U}_2^{\star }\), see (155), and expand the exponent in powers of \(it{\mathbb U}_2^*\). We obtain \(R=R_1+R_2\), where
Note that, for \(2<s\le 3\), we have \(|{\tilde{r}}|\le c |tZ|^{s/2}\). Combining (150) and (156) we obtain via the Cauchy–Schwarz inequality,
In order to prove \(R_1\prec \mathcal R\) we use the fact that the random variable \({\mathbb U}_1(\Omega _m)\) and the random variables \({\mathbb U}_2^{\star }\) and \({\tilde{r}}\) are independent. Invoking the inequality \(|{\tilde{r}}|\le t^2Z^2\) we obtain from (165) and (150)
We thus arrive at (158).
Let us prove (159). Use the decomposition (145) and expand the exponent (in \(h_9\)) in powers of \(it{\mathbb Z}_1\) to get \(h_9=h_{10}+R\), where
Combining (146) and (150) we obtain via the Cauchy–Schwarz inequality
Therefore, we have
Now we expand the exponent in \(h_{10}\) in powers of \(it({\mathbb Z}_2+{\mathbb Z}_3)\) and obtain \( h_{10}=h_{11}+h_{12}+R\), where
and where \(|R|\le |t|^3\mathbf{E}|Z|\,|{\mathbb Z}_2+{\mathbb Z}_3|^2\). Combining (146) and (150) we obtain via the Cauchy–Schwarz inequality \(|R|\le |t|^3mN^{-3}\prec \mathcal R\). Therefore, we have
We complete the proof of (159) by showing that
In order to prove the second bound write
We shall show that \(R_3\prec \mathcal R\). Using the fact that the random variable \({\mathbb U}_1(A_1)\) and the random variables Z, \({\mathbb Z}_3\) and \({\mathbb Z}_4\) are independent we obtain from (165)
In the last step we combined (146), (150) and the Cauchy–Schwarz inequality. The proof of \(R_1\prec \mathcal R\) is similar.
In order to prove the first relation of (160) we expand the exponent in powers of \(it{\mathbb Z}_4\) and obtain \(h_{11}=\mathbf{E}\exp \{it{\mathbb U}_1\}itZ+R\). Furthermore, combining (165), (146) and (150) we obtain
Hence the first relation of (160). The proof of (153) is complete.
Let us prove (154). By symmetry and independence,
Here we denote \(z=g_3(X_1,X_2,X_3)\) and write,
Furthermore, write
In what follows we expand the exponents in powers of \(it x_j\), \(j=1,2,3\), and use the fact that \(\mathbf{E}\bigl ( g_3(X_1,X_2,X_3)\bigm |X_1,X_2\bigr )=0\) as well as the obvious symmetry. Thus, we have
Furthermore, we have
Invoking the bounds \(|r_j|\le |tx_j|^2\) and \(|v_j|\le |tx_j|\) we obtain
where \(|R|\le c|t|^{5}\mathbf{E}|z x_1x_2|\,x_3^2\). The bound \(|R|\le c_*|t|^5N^{-9/2}\) (which follows by the Cauchy–Schwarz inequality) in combination with (161) and (162) implies
Note that \(\left( {\begin{array}{c}N\\ 3\end{array}}\right) |w|\le c_*N^{-1}\). In order to show (154) we replace \(\mathbf{E}e^{it{\mathbb U}_*}\) by \(e^{-t^2/2}\). Therefore, (154) follows from (163) and the inequalities
The second inequality is a direct consequence of (169). The proof of the first inequality is routine and omitted here. Thus the proof of (140) is complete.
5.2.3 Completion of the proof of (128)
Here we show that
This relation in combination with (139) and (140) implies \(\mathbf{E}e^{it{\mathbb T}}\sim {\hat{G}}(t)\).
Let \(G_U(t)\) denote the two-term Edgeworth expansion of the U-statistic \({\mathbb U}_1+{\mathbb U}_2\). That is, \(G_U(t)\) is defined by (2), but with \(\kappa _4\) replaced by \(\kappa _4^*\), where \(\kappa _4^*\) is obtained from \(\kappa _4\) by removing the summand \(4\mathbf{E}g(X_1)g(X_2)g(X_3)\chi (X_1,X_2,X_3)\). Furthermore, let \({\hat{G}}_U(t)\) denote the Fourier transform of \(G_U(t)\). It is easy to show that
Therefore, in order to prove (164) it suffices to show that \({\hat{G}}_U(t)\sim \mathbf{E}\exp \{it({\mathbb U}_1+{\mathbb U}_2)\}\). The bound
where \(\varepsilon _N\downarrow 0\), was shown by Callaert, Janssen and Veraverbeke [16] and by Bickel, Götze and van Zwet [11]. An inspection of their proofs shows that under the moment conditions (5) one can replace \(\varepsilon _N\) by \(c_*N^{-\nu }\). This completes the proof of (127).
For the reader's convenience we formulate in Lemma 3 a known result on upper bounds for characteristic functions.
Lemma 3
Assume that (16) holds. There exists a constant \(c_*\) depending on \(D_*,M_*, r, s, \nu _1\) only such that, for \(N>c_*\) and \(|t|\le N^{1/2}/10^3\beta _3\) and \(B\subset \Omega _N\), we have
Here \(\alpha (t)=\mathbf{E}\exp \{itg_1(X_1)\}\) and \({\mathbb U}_1(B)=\sum _{j\in B}g_1(X_j)\).
Proof
Let us prove the first inequality of (165). Expanding the exponent, see (188), we obtain
Invoking the inequality \(1-10^{-3}\le \sigma ^2\le 1\) which follows from (169) for \(N>c_*\), where \(c_*\) is sufficiently large, we obtain \(|\alpha (t)|\le 1-t^2/4N\), for \(|t|\le N^{1/2}/10^3\beta _3\).
The second inequality of (165) follows from the first one via the inequality \(1+x\le e^x\), for \(x\in \mathbb R\).\(\square \)
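The bound \(|\alpha (t)|\le 1-t^2/4N\) rests on the generic third-order estimate: for a mean-zero \(Y\) with variance \(\sigma ^2\) and \(\beta _3=\mathbf{E}|Y|^3\), one has \(|\mathbf{E}e^{itY}|\le |1-t^2\sigma ^2/2|+|t|^3\beta _3/6\). A numeric sketch of this generic estimate for a Rademacher variable, where the characteristic function is \(\cos t\) (illustrative only; the paper's \(\alpha (t)\) refers to \(g_1(X_1)\)):

```python
import math

# Rademacher Y: sigma^2 = 1, beta_3 = 1, and E e^{itY} = cos t.
sigma2, beta3 = 1.0, 1.0

ok = all(
    abs(math.cos(t)) <= abs(1 - t * t * sigma2 / 2) + abs(t) ** 3 * beta3 / 6 + 1e-12
    for t in [k / 100 for k in range(-150, 151)]
)
print(ok)
```

For \(|t|\) small relative to \(\sigma ^2/\beta _3\) the cubic term is dominated by half of the quadratic term, which yields bounds of the form \(1-t^2\sigma ^2/4\).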
References
Angst, J., Poly, G.: A weak Cramér condition and application to Edgeworth expansions. Electron. J. Probab. 22(59), 1–24 (2017)
Babu, G.J., Bai, Z.D.: Edgeworth expansions of a function of sample means under minimal moment conditions and partial Cramér's condition. Sankhya Ser. A 55, 244–258 (1993)
Bai, Z.D., Rao, C.R.: Edgeworth expansion of a function of sample means. Ann. Stat. 19, 1295–1315 (1991)
Bentkus, V., Götze, F., van Zwet, W.R.: An Edgeworth expansion for symmetric statistics. Ann. Stat. 25, 851–896 (1997)
Bentkus, V., Götze, F.: Lattice point problems and distribution of values of quadratic forms. Ann. Math. (2) 150(3), 977–1027 (1999)
Bentkus, V., Götze, F.: Optimal bounds in non-Gaussian limit theorems for U-statistics. Ann. Probab. 27, 454–521 (1999)
Bhattacharya, R.N., Rao, R.R.: Normal Approximation and Asymptotic Expansions. Robert E. Krieger Publishing Company, Inc., Malabar (1986)
Bhattacharya, R.N., Ghosh, J.K.: On the validity of the formal Edgeworth expansion. Ann. Stat. 6, 434–451 (1978)
Bhattacharya, R.N., Ghosh, J.K.: Correction to: On the validity of the formal Edgeworth expansion. Ann. Stat. 8, 1399 (1980)
Bickel, P.J.: Edgeworth expansions in nonparametric statistics. Ann. Stat. 2, 1–20 (1974)
Bickel, P.J., Götze, F., van Zwet, W.R.: The Edgeworth expansion for \(U\)-statistics of degree two. Ann. Stat. 14, 1463–1484 (1986)
Bickel, P., et al.: Willem van Zwet’s research. Ann. Stat. 49, 2439–2447 (2021)
Bickel, P.J., Robinson, J.: Edgeworth expansions and smoothness. Ann. Probab. 10, 500–503 (1982)
Bobkov, S.G.: Khinchine’s theorem and Edgeworth approximations for weighted sums. Ann. Stat. 47, 1616–1633 (2019)
Bollobás, B.: Combinatorics. Set Systems, Hypergraphs, Families of Vectors and Combinatorial Probability. Cambridge University Press, Cambridge (1986)
Callaert, H., Janssen, P., Veraverbeke, N.: An Edgeworth expansion for \(U\)-statistics. Ann. Stat. 8, 299–312 (1980)
Chibisov, D.M.: Asymptotic expansion for the distribution of a statistic admitting a stochastic expansion. I. Teor. Veroyatn. Primen. 25, 745–757 (1980)
Chung, K.-L.: The approximate distribution of Student’s statistic. Ann. Math. Stat. 17, 447–465 (1946)
Cramér, H.: Random Variables and Probability Distributions. Cambridge Tracts in Mathematics and Mathematical Physics, No. 36, 3rd edn. Cambridge University Press, Cambridge (1970; first edition 1937)
Dharmadhikari, S.W., Fabian, V., Jogdeo, K.: Bounds on the moments of martingales. Ann. Math. Stat. 39, 1719–1723 (1968)
Efron, B., Stein, C.: The jackknife estimate of variance. Ann. Stat. 9, 586–596 (1981)
Esseen, C.G.: Fourier analysis of distribution functions. A mathematical study of the Laplace–Gaussian law. Acta Math. 77, 1–125 (1945)
Götze, F.: Asymptotic expansions for bivariate von Mises functionals. Z. Wahrsch. Verw. Gebiete 50, 333–355 (1979)
Götze, F.: Lattice point problems and values of quadratic forms. Invent. Math. 157, 195–226 (2004)
Götze, F., van Zwet, W.R.: Edgeworth expansions for asymptotically linear statistics. Manuscript 1–45 (1992)
Götze, F., van Zwet, W.R.: An Expansion for a Discrete Non-lattice Distribution. Frontiers in Statistics, pp. 257–274. Imperial College, London (2006)
Götze, F., Zaitsev, A.: Explicit rates of approximation in the CLT for quadratic forms. Ann. Probab. 42, 354–397 (2014)
Hall, P.: Edgeworth expansion for Student’s t statistic under minimal moment conditions. Ann. Probab. 15, 920–931 (1987)
Helmers, R.: Edgeworth Expansions for Linear Combinations of Order Statistics. Mathematical Centre Tracts, vol. 105. CWI, Amsterdam (1982)
Hodges, J.L., Jr., Lehmann, E.L.: Deficiency. Ann. Math. Stat. 41, 783–801 (1970)
Hoeffding, W.: A class of statistics with asymptotically normal distribution. Ann. Math. Stat. 19, 293–325 (1948)
Ledoux, M., Talagrand, M.: Probability in Banach Spaces. Isoperimetry and Processes. Springer, Berlin (1991)
Petrov, V.V.: Sums of Independent Random Variables. Springer, New York (1975)
Pfanzagl, J.: Asymptotic expansions for general statistical models. With the assistance of W. Wefelmeyer. In: Lecture Notes in Statistics, vol. 31. Springer, Berlin (1985)
Serfling, R.J.: Approximation Theorems of Mathematical Statistics. Wiley, New York (1980)
Yurinskii, V.V.: Exponential inequalities for sums of random vectors. J. Multivar. Anal. 6, 473–499 (1976)
van Zwet, W.R.: A Berry–Esseen bound for symmetric statistics. Z. Wahrsch. Verw. Gebiete 66, 425–440 (1984)
Funding
Open Access funding enabled and organized by Projekt DEAL.
Additional information
In memoriam Willem Rutger van Zwet, *March 31, 1934, †July 2, 2020.
Research funded in part by the German Research Foundation—SFB 1283/2 2021—317210226.
Appendices
Appendix 1
In Lemma 4 below we compare the moments \(\Delta _m^2\) and \(\mathbf{E}R_m^2\), where \(R_m\) is the remainder of expansion (14),
For \(k=1,\dots , N\), write \(\Omega _k=\{1,2,\dots , k\}\) and denote \( \sigma _k^2:=\mathbf{E}g_k^2(X_1,\dots , X_k)=\mathbf{E}T_{\Omega _k}^2\). It follows from (14), by the orthogonality property (15), that
Lemma 4
Assume that \(\mathbf{E}\mathbb T^2<\infty \). Then
Assume that (5) and (6) hold, then there exists a constant \(c_*<\infty \) depending on \(D_*,M_*,r,s, \nu _1\) such that
Remark. For \(m=3\), inequality (168) yields \(\Delta _3^2\le \zeta _2+N^{-1}\Delta _4^2\).
Proof
Let us prove (167). The identity
where \(\mathbb U_{k|m}=\sum _{|A|=k,\, A\supset \Omega _m}T_A\), implies
We have
where \(b_k=[k]_m/[N]_m\) satisfies \(b_k\ge b_m\ge m!N^{-m}\). Here we denote \([x]_m=x(x-1)\cdots (x-m+1)\). A comparison of (166) and (171) shows (167).
Let us prove (168). We have
where \({\tilde{b}}_k=(N-m)/(k-m)\le N\). We obtain the inequality
which implies (168).
Let us prove (169). From (166), (167) we have, for \(\sigma ^2=N\sigma _1^2\),
Invoking the bounds, which follow from (5),
and using (6) we obtain (169). \(\square \)
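The orthogonality property (15) behind these variance identities can be seen on a toy degree-two kernel: with \(h(x,y)=xy\) and \(X\) uniform on \(\{0,1,2\}\), the Hoeffding projections \(g_1(x)=\mathbf{E}h(x,X)-\mathbf{E}h\) and \(g_2(x,y)=h(x,y)-\mathbf{E}h-g_1(x)-g_1(y)\) are exactly uncorrelated. A minimal sketch (an illustration, not the paper's statistic):

```python
support = [0, 1, 2]          # X uniform on {0, 1, 2}
p = 1.0 / len(support)

def h(x, y):                 # symmetric degree-two kernel
    return x * y

mu = sum(h(x, y) * p * p for x in support for y in support)

def g1(x):                   # first Hoeffding projection
    return sum(h(x, y) * p for y in support) - mu

def g2(x, y):                # degenerate second-order part
    return h(x, y) - mu - g1(x) - g1(y)

# Orthogonality: E g1(X1) g2(X1, X2) = 0 and E g2(X1, X2) = 0.
cross = sum(g1(x) * g2(x, y) * p * p for x in support for y in support)
mean2 = sum(g2(x, y) * p * p for x in support for y in support)
print(abs(cross) < 1e-12, abs(mean2) < 1e-12)
```

Because the parts are uncorrelated, the variance of the U-statistic splits into the weighted sums of \(\sigma _k^2\) used throughout the appendix.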
In Lemma 5 below we establish moment bounds for various parts of the Hoeffding decomposition defined in Sect. 2.
Lemma 5
Assume that \(\sigma _\mathbb T^2=1\). For \(3\le m\le N\) and \(s>2\), we have
Here c denotes an absolute constant and c(s) denotes a constant which depends only on s.
Proof
The inequalities (172) are proved in [4].
Let us prove (173). Split \(\Lambda _4=z_1+\dots +z_m\), where
Let \(\mathbf{E}'\) denote the conditional expectation given \(X_{m+1},\dots , X_N\). It follows from Rosenthal’s inequality that almost surely
Invoking Hölder’s inequality we obtain, by symmetry,
Using well-known martingale moment inequalities (and their applications to U-statistics), see [20], one can show the bound \(\mathbf{E}|z_1|^s\le c(s)N^{-3s/2}\zeta _s\). Invoking this bound in (174) we obtain the first bound of (173).
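Rosenthal's inequality invoked above bounds \(\mathbf{E}|z_1+\dots +z_m|^s\) by \(c(s)\bigl (\sum \mathbf{E}|z_i|^s+(\sum \mathbf{E}z_i^2)^{s/2}\bigr )\) for independent centered summands. A Monte Carlo sanity check for \(s=3\) with Rademacher summands (illustrative; the constant 8 is an assumption for the demo, not the optimal \(c(s)\)):

```python
import random

random.seed(1)
n, s, trials = 50, 3, 20_000

# Monte Carlo estimate of E|X_1 + ... + X_n|^s for Rademacher X_i.
total = 0.0
for _ in range(trials):
    S = sum(random.choice((-1, 1)) for _ in range(n))
    total += abs(S) ** s
lhs = total / trials

# Rosenthal right-hand side: sum E|X_i|^s + (sum E X_i^2)^{s/2}.
rhs = n * 1.0 + (n * 1.0) ** (s / 2)

print(lhs <= 8 * rhs)  # holds with the generous constant 8
```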
In order to prove the second bound of (173) write
A simple calculation shows \(\mathbf{E}(U^*_k)^2= \left( {\begin{array}{c}N-m\\ k-1\end{array}}\right) \sigma _k^2\). Therefore, by orthogonality,
In the last step we invoke (170) and use the bound \(b_k\le N^3\), where \(b_k= \left( {\begin{array}{c}N-m\\ k-1\end{array}}\right) \left( {\begin{array}{c}N-4\\ k-4\end{array}}\right) ^{-1}\). Clearly, (175) implies \(\mathbf{E}\eta _i^2\le N^{-4}\Delta _4^2\). Finally, using the fact that \(\eta _1,\dots , \eta _m\) are uncorrelated we obtain
thus completing the proof.\(\square \)
Before formulating the next result we introduce some notation. Given m let \(\mathcal D\) denote the class of subsets \(A\subset \Omega _N\) satisfying \(|A|\ge 4\) and \(\Omega _m\cap A\not =\emptyset \). Introduce the random variable \({\mathbb H}(m)=\sum _{A\in \mathcal D}T_A\). Denote \(x_i=2i-1\) and \(y_i=2i\). For an even integer \(m=2k\le N\) write
and put \(A_0=B_0=\emptyset \). Let \(\mathcal{A}(k)\) (respectively \(\mathcal{B}(k)\)) denote the collection of those \(A\in \mathcal D\) which satisfy \(A\cap A_k=\emptyset \) (respectively \(A\cap B_k=\emptyset \)). Furthermore, let \(\mathcal{C}(k)\) denote the collection of \(A\in \mathcal{D}\) such that \(A\cap A_k\not =\emptyset \) and \(A\cap B_k\not =\emptyset \). Write
Lemma 6
There exists an absolute constant c such that,
For an even integer \(m=2k<N\) we have
Proof
Let us prove the first bound of (176). For \(m=4\) we have
A calculation shows that, for \(k=1,2,3,4\),
where the numbers
Invoking (171) we obtain
Finally, we obtain (176) for \(m=4\).
In order to prove (176) for \(m=5,6,\dots \) we apply a recursive argument. Write
where \(d_m={\mathbb H}(m+1)-{\mathbb H}(m)\) is the sum of those \(T_A\) with \(|A|\ge 4\) satisfying \(A\cap \Omega _m=\emptyset \) and \(A\cap \Omega _{m+1}\not =\emptyset \). In particular, we have
Therefore,
where the numbers
Invoking (171) we obtain \(\mathbf{E}d_m^2\le N^{-4}\Delta _4^2\). This bound together with (179) implies (176).
Let us prove (177). Note that for \(m=2k\) we have \({\mathbb H}(m)={\mathbb H}_A(k)+{\mathbb H}_B(k)+{\mathbb H}_C(k)\) and the summands are uncorrelated. Therefore, the first bound of (177) follows from (176).
Let us show the second inequality of (177). For \(k=2\) we have \(\mathcal{C}(2)\subset \mathcal{C}\), where \(\mathcal C\) denotes the class of subsets \(A\subset \Omega _N\) such that \(|A|\ge 4\) and \(|A\cap \Omega _4|\ge 2\). Write \({\mathbb H}_C=\sum _{A\in \mathcal C}T_A\). We have
In the last step we applied (178). We obtain (177) for \(k=2\).
In order to prove the bound (177), for \(k=3,4,\dots \), we apply a recursive argument similar to that used in the proof of (176). Denote
We shall show that
This bound in combination with the identity \(\mathbf{E}{\mathbb H}_C^2(k+1)=\mathbf{E}{\mathbb H}_C^2(k)+\mathbf{E}d_{[k]}^2\) shows (177) for arbitrary k.
In order to show (180) split the set \(\mathcal{C}(k+1)\setminus \mathcal{C}(k)\) into \(2k+1\) non-intersecting parts
where we denote
By the orthogonality property (\(\mathbf{E}T_AT_{V}=0\) for \(A\not =V\)), the random variables
are uncorrelated. Therefore, we have
A calculation shows that
where the coefficients
Invoking (171) we obtain \(\mathbf{E}d_{x.y}^2\le N^{-5}\Delta _4^2\). The same argument shows \(\mathbf{E}d_{x.i}^2=\mathbf{E}d_{y.i}^2\le N^{-5}\Delta _4^2\). The latter bound in combination with (181) shows (180). The lemma is proved.\(\square \)
Appendix 2
Here we construct bounds for the probability density function (and its derivatives) of the random variables \(g_k^*=(N/M)^{1/2}g_k\), for \(1\le k\le n-1\), where \(g_k\) are defined in (74). Since these random variables are identically distributed it suffices to consider
Here \(R=\sqrt{n\,M\,N\,}\). Introduce the random variables
Let \(p_i(\cdot )\) denote the probability density function of \(g_i^*\), for \(i=1,2,3\). Recall that the integers \(n\approx N^{50\nu }\le N^{\nu _2/10}\) and \(M\approx N/n \ge N^{9/10}\) are introduced in (29) and the number \(\nu >0\) is defined by (17).
Lemma 7
Assume that the conditions of Theorem 1 are satisfied. There exist positive constants \(C_*, c_*, c_*'\) depending only on \(M_*, D_*, \delta , r\) and \(\nu _1, \nu \) such that, for \(i=1,2,3\), we have uniformly in \(u\in \mathbb R\) and \(N>C_*\)
Furthermore, given \(w>0\) there exists a constant \(C_*(w)\) depending on \(M_*, D_*, \delta , r\), \(\nu _1, \nu \) and w such that uniformly in \(z_*\in [-2w,2w]\) and \(N>C_*(w)\) we have
Proof
We shall prove (182) and (183) for \(i=1\). For \(i=2,3\), the proof is almost the same. Before starting the proof we introduce some notation and collect auxiliary results.
Denote
and recall that \(q_N=\mathbf{P}\{A_j\}\), where \(A_j=\{\Vert Z_j'\Vert _r\le N^{\alpha }\}\). It follows from \(\mathbf{E}g(X_{m+1})=0\) that
Therefore, by Chebyshev’s inequality, for \(\alpha =3/(r+2)\) we have
In the last step we invoked the inequalities \(\alpha (r-1)\ge 1+(r-1)/(r+2)\ge 3/2\) and \(q_N^{-1}\le c_*\), see (45), and \(\mathbf{E}|g(X_{m+1})|\,\Vert Z_{m+1}'\Vert _r^{r-1}\le M_*\), where the latter inequality follows from (5) by Hölder's inequality.
Similarly, the identities
in combination with (44) and the inequalities
and \(\alpha (r-2)=1+2(r-4)/(r+2)\ge 1\) yield
Introduce the random variables
We have \(g_*=s^{-1}(g_1^*-\theta )\). Let \(p(\cdot )\) denote the density function of \(g_*\). Note that \(p_1(u)=s^{-1}p\bigl (s^{-1}(u-\theta )\bigr )\). Furthermore, we have, by (184), \(|\theta |\le c_*N^{-1}\) and, by (185), (169), \(|s^2-1|\le c_*N^{-1}\). Therefore, it suffices to prove (182) and (183) for \(p(\cdot )\) (we verify the latter inequality for every \(z_*\in [-3w,3w]\)).
In order to prove (182) and (183) we approximate the characteristic function \({\hat{p}}(t)=\mathbf{E}e^{itg_*}\) by \(e^{-t^2/2}\) and then apply a Fourier inversion formula. Write
The fact that \(\tau (t)=0\), for \(|t|\ge 1\), implies \({\hat{p}}(t)=0\), for \(|t|>s\, R\). Therefore, we obtain from the Fourier inversion formula,
Write \({\hat{p}}(t)-e^{-t^2/2}=r_1(t)+r_2(t)\), where
We shall show below that
These bounds in combination with the simple inequality
show that
Here \(\varphi \) denotes the standard normal density function
It follows from (187) that
Furthermore, given w we have uniformly in \(|z_*|\le 3w\)
for sufficiently large M (for \(N>C_*(w)\)).
In order to prove an upper bound for the \(k\)-th derivative, \(|p^{(k)}(x)|\le c_*\), write
and replace \({\hat{p}}(t)\) by \(e^{-t^2/2}\) as in the proof of (187). We obtain
This implies \(|p^{(k)}(x)-\varphi ^{(k)}(x)|\le c_*M^{-1/2}\). We arrive at the desired bound \(|p^{(k)}(x)|\le c_*\), for \(k=1,2,3\).
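The Fourier inversion step used here can be checked numerically in a toy case. The sketch below (illustrative only; the truncation level and trapezoid rule are arbitrary choices) recovers the standard normal density from its characteristic function \(e^{-t^2/2}\).

```python
import math

# Recover the standard normal density via the Fourier inversion formula
# p(x) = (2*pi)^{-1} * integral e^{-itx} phat(t) dt.  Since phat(t) =
# exp(-t^2/2) is real and even, this reduces to
# (1/pi) * integral_0^T cos(t x) exp(-t^2/2) dt, truncated at T.

def density_by_inversion(x, T=10.0, steps=20000):
    h = T / steps
    total = 0.0
    for k in range(steps + 1):
        t = k * h
        w = 0.5 if k in (0, steps) else 1.0   # trapezoid weights
        total += w * math.cos(t * x) * math.exp(-t * t / 2)
    return total * h / math.pi

phi = lambda x: math.exp(-x * x / 2) / math.sqrt(2 * math.pi)
err = max(abs(density_by_inversion(x) - phi(x)) for x in (0.0, 0.5, 1.0, 2.0))
```

The truncation error is negligible here because \(e^{-t^2/2}\) decays rapidly; in the proof the same mechanism makes the tail of \({\hat{p}}(t)\) harmless.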
In the remaining part of the proof we verify (186). For \(i=2\) this bound follows from \(|\tau (t/sR)-1|\le ct^2/(sR)^2\). The latter inequality is a consequence of the short expansion
and \(\mathbf{E}\xi _1=0\) and \(\mathbf{E}\xi _1^2\le c\), for some absolute constant c.
Let us prove (186) for \(i=1\). Introduce the sequence of i.i.d. centered Gaussian random variables \(\eta _1,\,\eta _2,\,\dots \) with variances \(\mathbf{E}\eta _i^2=M^{-1}\). Denote
We are going to apply the well-known inequality
It follows from (188) and identities \(\mathbf{E}\eta _1^i=\mathbf{E}w_1^i\), \(i=1,2\), that
Here we use the inequality \(\mathbf{E}|\eta _1|^3\le c\mathbf{E}|w_1|^3\), which follows from \(\mathbf{E}\eta _1^2=\mathbf{E}w_1^2\).
Combining (189) and the simple identity
we obtain
Here we denote
We shall show below that
where \(\delta ''>0\) depends on \(\delta , A_*, D_*, M_*, \nu _1\) and is given in (36). This inequality in combination with (190) proves (186).
Let us prove (191). Clearly, \(Z\le |f^{M-1}(t)|+|\gamma ^{M-1}(t)|\). Furthermore, \(f^M(t)=e^{-t^2/2}\). In order to prove (191) we shall show
To show (192) we expand \(e^{itw_1}\) using (188),
Here we used the identity \(|1-t^2/2M|=1-t^2/2M\), which holds for \(|t|<M^{1/2}/{\tilde{\beta }}_3\), since \({\tilde{\beta }}_3\ge 1\). Finally, an application of the inequality \(1-x\le e^{-x}\) to \(x=t^2/3M>0\) completes the proof of (192).
Let us prove (193). For \(\delta ''\) defined by (36) we shall show \(\delta ''\le 2{\tilde{\delta }}\), where
We are going to replace \(g(Y_{m+1}),\, {\tilde{\beta }}_3,\, s^2\) by \(g(X_{m+1}), \,\beta _3, \,\sigma ^2\) respectively. Write
It follows from (44), (45) that, for every \(v\in \mathbb R\),
These bounds imply
One can show that, for sufficiently large N (i.e., for \(N>C_*\)), we have
Using (194), (195) we get, for \(N>C_*\),
We obtain \(|\gamma (t)|\le 1-{\tilde{\delta }}\le 1-\delta ''/2\) and, therefore, \(|\gamma (t)|\le e^{-\delta ''/2}\). The lemma is proved.\(\square \)
Appendix 3
The main results of this section are the moment inequalities of Lemma 9 and the corresponding inequalities for conditional moments of Lemma 10. Lemma 8 provides an auxiliary inequality.
We start with some notation. We call \(v=v(\cdot ),u=u(\cdot )\in L^r\) orthogonal if \(\langle u,v \rangle =0\), where
Given \(f\in L^2(P_X)\) we have for the kernel \(\psi ^{**}\) defined in (41)
and almost surely
The latter identity says that almost all values of the \(L^r\)-valued random variable \(\psi ^{**}(\cdot ,X_2)\) are orthogonal to the vector \(g(\cdot )\in L^r\).
Let \( p_g:L^r\rightarrow L^r \) denote the projection on the subspace of elements \(u\in L^r\) which are orthogonal to \(g=g(\cdot )\). For \(v\in L^r\), write \(v^*=p_g(v)\). It follows from (197) that
where \( b^*(\cdot )=p_g(b(\cdot ))= \sigma ^{-2}p_g\bigl (\, \mathbf{E}\psi (\cdot ,X_1)g(X_1)\, \bigr )\). Denote
where the \(L^r\)-valued random variables \(U_k\) are introduced in (91). For the random variables \(g_k\) and \(L_k\) introduced in (72) and (74), we have
Denote \(K=\mathbf{E}|\psi (X_1,X_2)|^r\) and \(K_s=\mathbf{E}|\psi ^{**}(X_1,X_2)|^s\), \(s\le r\).
Lemma 8
Let \(4<r\le 5\). For \(s\le r\), we have
Proof
The first inequality of (200) is a consequence of Lyapunov’s inequality. Let us prove the second inequality. The inequality \(|a+b+c|^r\le 3^r(|a|^r+|b|^r+|c|^r)\) implies
Therefore, (200) is a consequence of the inequalities
Here \(\kappa =\mathbf{E}\psi (X_1,X_2)g(X_1)g(X_2)\). To prove the first inequality use \(|a+b|^r\le 2^r(|a|^r+|b|^r)\) to get
Furthermore, by the Cauchy–Schwarz inequality,
Finally, Lyapunov’s inequality implies
We obtain \(\mathbf{E}\bigl |\mathbf{E}\bigl (\psi (X_1,X_2)g(X_2)\,| \,X_1\bigr )\bigr |^r\le K\sigma ^r\) thus completing the proof.\(\square \)
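The two elementary tools of this proof, Lyapunov's inequality \(\mathbf{E}|X|^s\le (\mathbf{E}|X|^r)^{s/r}\) for \(s\le r\) and the convexity bound \(|a+b+c|^r\le 3^{r-1}(|a|^r+|b|^r+|c|^r)\) (sharper than the constant \(3^r\) used above), admit quick numerical checks; the distributions below are arbitrary illustrations.

```python
import random

random.seed(1)
r, s = 5.0, 2.0
sample = [random.uniform(-2.0, 3.0) for _ in range(10000)]

# Lyapunov's inequality E|X|^s <= (E|X|^r)^{s/r}, checked on the
# empirical distribution of the sample (an instance of the power-mean
# inequality, so it must hold exactly).
m_s = sum(abs(x) ** s for x in sample) / len(sample)
m_r = sum(abs(x) ** r for x in sample) / len(sample)
lyapunov_ok = m_s <= m_r ** (s / r)

# Pointwise convexity bound |a+b+c|^r <= 3^{r-1}(|a|^r + |b|^r + |c|^r),
# which follows from convexity of t -> |t|^r applied to the average.
triples = [(random.gauss(0, 1), random.gauss(0, 1), random.gauss(0, 1))
           for _ in range(10000)]
pointwise_ok = all(
    abs(a + b + c) ** r
    <= 3 ** (r - 1) * (abs(a) ** r + abs(b) ** r + abs(c) ** r)
    for a, b, c in triples)
```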
Lemma 9
Let \(1\le k\le n-1\). For \({\overline{U}}_k^*\), an independent copy of \(U_k^*\), we have
Recall that \(\delta _3^2=\mathbf{E}|\psi ^{**}(X_1,X_2)|^2\).
Proof
Let us prove (201). By symmetry, we have, for \(i,j\in O_1\),
The inequality (201) follows from the inequalities
Let us prove (203). From (198) we have \(H_1=V_1+V_2+2V_3\), where
Let us show that
This inequality follows from (44), (45) and the identity
where \(V_1'=\mathbf{E}|\psi ^{**}(X_1,X_j)|^2(1-{\mathbb I}_{A_j})\) satisfies, by (43),
In the last step we applied Hölder’s inequality and Lemma 8 to get
Let us show that
For \({\tilde{b}}(\cdot ):= \mathbf{E}\psi (\cdot ,X_1)g(X_1)\) we have, by the Cauchy–Schwarz inequality,
Now the identity \(b^*=\sigma ^{-2}p_g({\tilde{b}})\) implies
Invoking the bound \(\mathbf{E}g^2(Y_j)\le c_*\sigma ^2\), see (47), we obtain (207).
Finally, write
Identity (197) implies \(\mathbf{E}{\tilde{V}}=0\). Therefore \(V_3=q_N^{-1}\mathbf{E}{\tilde{V}}({\mathbb I}_{A_j}-1)\). Invoking (43) and using \(q_N^{-1}\le c_*\), see (45), we obtain
In the last step we used the bound \(\mathbf{E}|{\tilde{V}}|\Vert Z_j'\Vert _r^{r-4}\le c_*\). In order to prove this bound we invoke the inequalities
to show that
Furthermore, by Hölder’s inequality and (200),
By the independence and (208),
Thus we arrive at (209). Combining (205), (207) and (209) we obtain (203).
Let us prove (204). Using (198) write \(H_2=Q_1+Q_2+2Q_3\), where
It follows from the identity (196) that
The simple inequality \(|\psi ^{**}(X_1,X_j)\psi ^{**}(X_1,X_i)|\le |\psi ^{**}(X_1,X_j)|^2+|\psi ^{**}(X_1,X_i)|^2\) yields, by symmetry,
In the last step we applied (206) and \(q_N^{-1}\le c_*\), see (44).
Furthermore, using the identity \(\mathbf{E}g(X_i)=0\) we obtain from (43)
In the last step we applied Hölder’s inequality to show \(\mathbf{E}|g(X_i)|\Vert Z_i\Vert _r^{r-1}\le c_*\).
The bounds (211), (44) and (208) together imply
The bound (204) follows from (210) and (212).
Let us prove (202). For this purpose we shall show that
and where \({\overline{Y}}_j\) denote independent copies of \(Y_j\), \(j\in O_k\). Using
we obtain, by symmetry and (47),
Now (213) follows from the well-known inequality
which is valid for independent centered random elements \(\xi _i\) with values in \(L^r\). One can derive this inequality from the Hoffmann–Jørgensen inequality (see, e.g., Proposition 6.8 in Ledoux and Talagrand [32]) using the type 2 property of the Banach space \(L^r\) and the symmetrization lemma (see formula (9.8) and Lemma 6.3 ibidem). The proof of the lemma is complete.\(\square \)
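The moment inequality invoked here is stated for \(L^r\)-valued sums; its real-valued prototype is Rosenthal's inequality. For \(r=4\) and i.i.d. centered summands that prototype can be verified exactly from the moment expansion \(\mathbf{E}S_n^4=n\mathbf{E}\xi ^4+3n(n-1)(\mathbf{E}\xi ^2)^2\). The constant 3 below is specific to this scalar sketch, not the Banach-space constant of the lemma.

```python
# Exact check of a Rosenthal-type bound for r = 4 and i.i.d. centered xi:
#   E S_n^4 = n*E xi^4 + 3*n*(n-1)*(E xi^2)^2
#           <= 3*(n^2*(E xi^2)^2 + n*E xi^4).
# For xi uniform on [-1, 1]: E xi^2 = 1/3, E xi^4 = 1/5.

m2, m4 = 1.0 / 3.0, 1.0 / 5.0

def fourth_moment_of_sum(n):
    """Exact E S_n^4 for a sum of n i.i.d. centered variables."""
    return n * m4 + 3 * n * (n - 1) * m2 ** 2

bounds_hold = all(
    fourth_moment_of_sum(n) <= 3 * (n ** 2 * m2 ** 2 + n * m4)
    for n in range(1, 200))
```

The Gaussian-dominated term \(n^2(\mathbf{E}\xi ^2)^2\) is exactly the \(n^{r/2}\)-scaling visible in the displayed inequality.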
Before formulating and proving Lemma 10 we introduce some more notation. Let \(\mathcal{B}(L^r)\) denote the class of Borel sets of \(L^r\). Consider the regular conditional probability \(P_k:{\mathbb R}\times \mathcal{B}(L^r)\rightarrow [0,1]\), defined, for \(z_k\in {\mathbb R}\) and \(B\in \mathcal{B}(L^r)\),
Recall, see (82), that \(\psi _k\) denotes an \(L^r\)-valued random variable with the distribution \(\mathbf{P}\{\psi _k\in B\}=P_k(z_k;B)\). Note that the \(L^r\)-valued random variable \(\psi _k^*=p_g(\psi _k)\) has distribution
Furthermore, using (199) we write (215) in the form
Let \({\overline{\psi }}_k\) respectively \({\overline{\psi }}^*_k\) denote an independent copy of \(\psi _k\) respectively \(\psi ^*_k\). Denote
Lemma 10
Let \(k=1,\dots , n-1\). Let \(|z_k|\le w\,n^{-1/2}\). There exist positive constants \(c_*^{(i)}\), \(i=0,1,2,3\), which depend on \(w, r,\nu _1,\nu _2, \delta , A_*,D_*,M_*\) only such that for
we have
Condition (216) requires N to be large enough. A simple calculation shows \(\tau _N\le N^{-75\nu }\), for \(\nu \) satisfying (17). Therefore, (87) implies \(\tau _N\le N^{-65\nu }\delta _3^2\). In particular, under (87) the inequality (216) is satisfied provided that \(N>c_*\), where \(c_*\) does not depend on \(\delta _3^2\).
Proof
By \({\tilde{c}}_*, {\tilde{c}}_*'\) we denote positive constants which depend only on \(w, r,\nu _1,\nu _2, \delta , A_*, D_*,M_*\); these constants may differ from one occurrence to the next. Given \(i,j\in O_k\), \(i\not = j\), introduce random variables
Here \(R=\sqrt{n\,M\,N}\) satisfies \(N/2\le R\le N\), by the choice of n and M. Let p, \(p_0\), \(p_1\), and \(p_2\) denote the densities of random variables \(\eta \), \(\zeta +\eta \), \(\zeta _i+\eta \), and \(\zeta _{ij}+\eta \) respectively.
Note that \(g_*=\sqrt{N/M}g_k\). Therefore, the condition \(g_k=z_k\) is equivalent to \(g_*=z_*\), where \(z_*=\sqrt{N/M}z_k\). Furthermore, \(|z_k|\le w\, n^{-1/2}\Leftrightarrow |z_*|\le w_*\), where \(w_*=w\sqrt{N/Mn}\le 2w\).
Given a random variable Y, we denote the conditional expectation \(\mathbf{E}(Y|g_*=z_*)=\mathbf{E}(Y|g_k=z_k)\) by \(\mathbf{E}_*Y\). For an event A, we have \(P(A|g_k=z_k)=P(A|g_*=z_*)\).
Proof of (217). For the \(L^r\)-valued random variable \({\hat{\psi }}^*=\psi _k^*-z_kb^*\) we have
Note that for an independent copy \({\overline{\psi }}^*_k\) of \(\psi ^*_k\) the distributions of \(\psi _k^*-{\overline{\psi }}^*_k\) and \({\hat{\psi }}^*-{\hat{\psi }}^*_c\) are the same. Here \({\hat{\psi }}^*_c\) denotes an independent copy of \({\hat{\psi }}^*\). Therefore,
In order to prove (217) we show that
and, for \(\tau _N\le c_*^{(0)}\delta _3^2\) (i.e., for sufficiently large N),
Since \(N^{-1}n<\tau _N\), we can choose \(c_*^{(0)}\) small enough such that the inequalities (220), (221) and (222) together imply (217).
Proof of (221). Recall that an element \(m=m(\cdot )\in L^2(P_X)\) is called the mean of an \(L^2(P_X)\)-valued random variable \({\hat{\psi }}^*={\hat{\psi }}^*(\cdot )\) if for every \(f=f(\cdot )\in L^2(P_X)\)
We shall show below that \(\mathbf{E}\Vert {\hat{\psi }}^*\Vert _2^2<\infty \). Then, by Fubini,
Therefore, \(m(x)=\mathbf{E}{\hat{\psi }}^*(x)\), for \(P_X\) almost all x.
For \(f\in L^2(P_X)\) it follows from (219) that
Fix \(i\in O_k\). By symmetry,
An application of (252) yields
where
are non-random elements of \(L^r\). It follows from (223), (224), (225) that
In order to prove (221) we show that, for \(|z_*|\le w_*\),
and apply (208). Note that, by Lemma 7, there exist positive constants \({\tilde{c}}_*, {\tilde{c}}_*'\) such that, for \(M,N>{\tilde{c}}_*'\), the inequality (228) holds.
Let us prove (226). In Lemma 7 we show, for \(i=1,2\), that \(p_i\) and its derivatives are bounded functions. That is,
Expanding in powers of \(M^{-1/2}g(Y_i)\) we obtain
It follows from the identities (196) and (197) that for \(P_X\) almost all x
Using (229) and the inequality \(q_N^{-1}\le c_*\), see (44), we obtain from (230)
where we denote \(a_2(\cdot )=\mathbf{E}\psi ^{**}(\cdot ,Y_i)g^2(Y_i)\). In order to prove (226) we show that
Let us prove (231). Invoking (43) we obtain, by Hölder’s inequality,
where we denote \(w(x)=\mathbf{E}|\psi ^{**}(x,X_i)|^r\). Furthermore, by Lyapunov’s inequality,
Clearly, the first bound of (231) follows from (232), (233) and (200). A similar argument shows the second bound of (231). We have
where we denote \(V=\mathbf{E}\bigl ( \Vert Z_i'\Vert _r^{r-2} |g(X_i)|\bigr )^{r/(r-1)}\). By Hölder’s inequality,
Clearly, (233), (234) and (235) imply the second bound of (231). The last bound of (231) follows from (47), by the Cauchy–Schwarz inequality. Indeed, we have
Therefore, \(\Vert a_2(\cdot )\Vert _2^2\le c_*K_2\mathbf{E}g^4(X_i)\le c_*\), by (200).
Let us prove (227). We have, by (251),
In order to prove (227) it suffices, in view of (228), to show that
Let \({\tilde{p}}\) denote the density function of \(\xi _k\). Then \(p(u)=R\,{\tilde{p}}(R\, u)\). We have
Therefore, denoting \(H(z_*)=1+|R\,(z_*-\zeta )|^5\), we obtain
On the event \(|\zeta -z_*|\ge R^{-1/2}\) we have \( H^{-1}(z_*) \le R^{-5/2}\). Furthermore, a bound for the probability of the complementary event
follows by the Berry–Esseen bound applied to the sum \(\zeta \). Therefore, \(\mathbf{E}H^{-1}(z_*)\) is bounded by the right side of (236). Now (236) follows from (237).
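The Berry–Esseen step above controls the probability that the normalized sum \(\zeta \) lands in a small interval. A Monte Carlo sketch of this mechanism in a scalar toy model (all parameters are illustrative choices, not the paper's):

```python
import math
import random

random.seed(7)
n, trials, eps, z = 100, 40000, 0.1, 1.0
sqrt3 = math.sqrt(3.0)

# zeta = n^{-1/2} * sum of i.i.d. uniform(-sqrt(3), sqrt(3)) (unit variance).
hits = 0
for _ in range(trials):
    s = sum(random.uniform(-sqrt3, sqrt3) for _ in range(n)) / math.sqrt(n)
    hits += abs(s - z) <= eps
empirical = hits / trials

# Normal approximation: P(|zeta - z| <= eps) ~ 2*eps*phi(z), the Berry-Esseen
# theorem bounding the error by a multiple of n^{-1/2}.
phi = math.exp(-z * z / 2) / math.sqrt(2 * math.pi)
predicted = 2 * eps * phi
```

In the proof the interval has length of order \(R^{-1/2}\), so the same estimate gives a probability of order \(R^{-1/2}\).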
Proof of (222). Write
It follows from (219), by the inequality \(\Vert u+v\Vert _2^2\ge \Vert u\Vert _2^2/2-\Vert v\Vert _2^2\), for \(u,v\in L^2(P_X)\), that
We shall show that
The inequalities (238) and (239) imply the lower bound in (222). Indeed, by (228), we have, for small \(c_*^{(0)}\),
Similarly, the inequalities (238) and (240) imply the upper bound in (222).
Proof of (238). We have, by (251),
Proceeding as in the proof of (236), we obtain
where \({\tilde{H}}(z_*)=1+|R(z_*-\zeta )|^{4}\) satisfies
Therefore, \( W \le c_*R^{-3/2}+c_*R^{-1}M^{-1/2}\). This inequality in combination with (208) implies (238).
Proof of (239). Fix \(i,j\in O_k\), \(i\not =j\). By symmetry,
We have, by (252),
The inequality (239) follows from (241) and the bounds
Let us prove (242). It follows from (229), by the mean value theorem, that
where \(|Q|\le c_*M^{-1/2}\). Indeed, by (47) and the Cauchy–Schwarz inequality,
In the last step we applied (200). Furthermore, the identity
combined with (43), (44) and (45) yields \(\mathbf{E}T_{11}\ge \delta _3^2-c_*M^{-1/2}\). This bound together with (244) shows (242).
Let us prove (243). Write \(y_i=g(Y_i)\) and expand
where \({\tilde{Q}}\) denotes the remainder term. From (229) it follows, for \(2<r-2\le 3\) that
Furthermore, denote
We obtain, by symmetry,
Denote
It follows from (45), by Hölder's inequality and (200), that
Therefore,
Furthermore, (196) and (197) imply
Invoking the inequalities \(q_N^{-2}\le c_*\), see (44), and \(1-\mathbb I_{A_i}\le V_i^s\), \(s>0\), where \(V_i:= \Vert Z_i'\Vert _r/N^{\alpha }\), see (43), we obtain, by Hölder’s inequality,
Combining (245), (247), (246) and using the simple inequalities
and the inequalities (229), we obtain (243).
Proof of (240). The inequality follows from (241), (243) and the inequality
which is obtained in the same way as (242) above.
Proof of (218). In order to prove (218) we shall show that
Split \(O_k=B\cup D\), where \(B\cap D=\emptyset \) and \(|B|=[M/2]\) and write
In particular, we have \(g_*=\eta +\zeta _B+\zeta _D\).
The inequality
combined with the bounds
implies (248). Let us prove the first bound of (249). By (252), we have
where \(p_3\) denotes the density of \(\eta +\zeta _D\). Furthermore, invoking the bound \(\sup _{x\in {\mathbb R}}|p_3(x)|\le c_*\) (which is obtained using the same argument as in the proof of Lemma 7) and the inequality (228), we obtain \(\mathbf{E}_*\Vert U_B\Vert _r^r\le {\tilde{c}}_*\mathbf{E}\Vert U_B\Vert _r^r\). Finally, invoking the bound
see (47) and (214), we obtain the first bound of (249). The second bound is obtained in the same way. This completes the proof of the lemma.\(\square \)
We collect some facts about conditional moments in a separate lemma.
Lemma 11
Let \(\eta \) and \(\zeta \) be independent random variables. Assume that \(\eta \) is real valued and has a density, say \(x\rightarrow p(x)\).
(i) Assume that \(\zeta \) is real valued. Then the function
is a density of the distribution \(P_{\eta +\zeta }\) of \(\eta +\zeta \). Let \(w:{\mathbb R}\rightarrow {\mathbb R}\) be a measurable function such that \(\mathbf{E}|w(\eta )|<\infty \). For \(P_{\eta +\zeta }\) almost all \(x\in {\mathbb R}\), we have
(ii) Assume that \(\zeta \) takes values in a measurable space, say \(\mathcal Y\). Assume that \(u,v:\mathcal Y\rightarrow {\mathbb R}\) are measurable functions and denote \(P_{\eta +u(\zeta )}\) the distribution of \(\eta +u(\zeta )\). If \(\mathbf{E}|v(\zeta )|<\infty \), then for \(P_{\eta +u(\zeta )}\) almost all \(x\in {\mathbb R}\),
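Lemma 11(ii) can be sanity-checked by simulation in a scalar toy case: with \(\eta \) standard normal, \(\zeta =\pm 1\) with probability 1/2 and \(u(\zeta )=v(\zeta )=\zeta \), the conditional-expectation formula reduces to \(\tanh (x)\). Everything below is an illustrative choice, not one of the paper's objects.

```python
import math
import random

random.seed(3)

# Lemma 11(ii) toy case: eta ~ N(0,1), zeta = +/-1 with prob 1/2,
# u(zeta) = v(zeta) = zeta.  The formula
#   E(v(zeta) | eta + u(zeta) = x) = E[v(zeta) p(x - u(zeta))] / p_0(x),
# with p the standard normal density, reduces to
#   (phi(x-1) - phi(x+1)) / (phi(x-1) + phi(x+1)) = tanh(x).

x, h = 1.0, 0.05
num = den = 0.0
for _ in range(400000):
    zeta = random.choice((-1.0, 1.0))
    s = random.gauss(0.0, 1.0) + zeta
    if abs(s - x) <= h:        # crude conditioning on {eta + zeta ~ x}
        num += zeta
        den += 1.0
mc_estimate = num / den
formula = math.tanh(x)
```

The crude window conditioning introduces an \(O(h^2)\) bias, well inside the Monte Carlo tolerance used here.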
Appendix 4
In the next lemma we consider independent and identically distributed random vectors \((\xi ,\,\eta )\) and \((\xi ',\, \eta ')\) with values in \({\mathbb R}^2\) and the symmetrization \((\xi _s,\eta _s)\) where \(\xi _s=\xi -\xi '\) and \(\eta _s=\eta -\eta '\). Note that in the main text we apply this lemma to \(\xi =g(X_1)\) and \(\eta =N^{-1/2}\sum _{j=m+1}^N\psi (X_1,Y_j)\).
Lemma 12
Let \(0<\nu <1/2\) and \(r>2\). Assume that \(\mathbf{E}|\xi |^r+\mathbf{E}|\eta |^r<\infty \). The following statements hold.
(a) For \(c_r=(7/12)2^{-r}\) the conditions
imply \(1-|\mathbf{E}\exp \{i(t\xi +\eta )\}|^2\ge 6^{-1}(t^2\mathbf{E}\xi _s^2+\mathbf{E}\eta _s^2)\).
(b) Assume that for some \({\tilde{c}}_1, {\tilde{c}}_2 >0\) we have
Let \(\varepsilon >0\) be such that
where \({\tilde{c}_3}=2+(5/{\tilde{c}}_1)^2\sigma _z^2\) and where the numbers A, B are defined in (265). Here \(\sigma _z^2=\mathbf{E}(\xi _s+N^{-1/2}\eta _s)^2\). Assume that for some \(0<\delta <{\tilde{c}}_2\) and \(\delta '>10\varepsilon ^2\),
Then for every \(T^*\), satisfying \(N^{1/2-\nu }\le |T^*|\le N^{\nu +1/2}\), the set
is an interval of size at most \(5{\tilde{c}}_1^{-1} \varepsilon \).
Proof
Proof of (a). Invoking the inequality \(1-\cos x\ge x^2/2-x^2/24-|x|^r\) and using the simple inequality \(|a+b|^r\le 2^{r-1}(|a|^r+|b|^r)\) we obtain
In the last step we used the conditions of (a).
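The trigonometric inequality \(1-\cos x\ge x^2/2-x^2/24-|x|^r\), \(r>2\), invoked above can be verified on a grid: for \(|x|\le 1\) it follows from \(1-\cos x\ge 11x^2/24\), while for \(|x|\ge 1\) the term \(|x|^r\ge x^2\) makes the right-hand side nonpositive. A numerical sweep (the values of \(r\) are arbitrary):

```python
import math

# Grid check of 1 - cos(x) >= x^2/2 - x^2/24 - |x|^r for several r > 2.

def holds(r, lo=-20.0, hi=20.0, steps=40001):
    for k in range(steps):
        x = lo + (hi - lo) * k / (steps - 1)
        lhs = 1.0 - math.cos(x)
        rhs = x * x / 2 - x * x / 24 - abs(x) ** r
        if lhs < rhs - 1e-12:   # tolerance for rounding at x = 0
            return False
    return True

ok = all(holds(r) for r in (2.1, 2.5, 3.0, 4.0, 5.0))
```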
Proof of (b). Introduce the function \(t\rightarrow \tau _t^*=1-|\mathbf{E}e^{it(\xi +N^{-1/2}\eta )}|^2\). Assume that the set \(I^*\) is non-empty and choose \(s,t\in I^*\), i.e., we have \(\tau _t^*,\tau _s^*\le \varepsilon ^2\). Firstly we show that \(|s-t|\le 5{\tilde{c}}_1^{-1} \varepsilon \), thus proving the bound for the size of the set \(I^*\).
The inequality \(1-\cos (x+y)\ge (1-\cos x)/2-(1-\cos y)\) implies
for arbitrary random variables X, Y. Choosing \({\tilde{Y}}=t(\xi +N^{-1/2}\eta )\) and \({\tilde{X}}=(s-t)(\xi +N^{-1/2}\eta )\) shows
Now we show that the inequality \(|t-s|>5{\tilde{c}}_1^{-1}\varepsilon \) implies \(1-|\mathbf{E}e^{i{\tilde{X}}}|^2>5\varepsilon ^2\), thus contradicting our choice \(\tau _s^*,\tau _t^*<\varepsilon ^2\) and (257). In what follows the cases of “large” and “small” values of \(|t-s|\) are treated separately.
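The splitting inequality \(1-\cos (x+y)\ge (1-\cos x)/2-(1-\cos y)\) behind (256) is equivalent to \(3+\cos x-2\cos y-2\cos (x+y)\ge 0\) and is \(2\pi \)-periodic in each variable, so a grid check over one period is conclusive up to rounding:

```python
import math

# Grid check of 1 - cos(x+y) >= (1 - cos x)/2 - (1 - cos y) over a full
# period in each variable; equality holds at x = y = 0.
ok2 = True
steps = 400
for i in range(steps + 1):
    x = -math.pi + 2 * math.pi * i / steps
    for j in range(steps + 1):
        y = -math.pi + 2 * math.pi * j / steps
        lhs = 1 - math.cos(x + y)
        rhs = (1 - math.cos(x)) / 2 - (1 - math.cos(y))
        if lhs < rhs - 1e-12:
            ok2 = False
```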
For \(5{\tilde{c}}_1^{-1}\varepsilon <|t-s|\le \delta \) we shall apply (256) to \({\tilde{X}}=X+Y\), where \(X=(s-t)\xi \) and \(Y=(s-t)N^{-1/2}\eta \). Note that statement (a) implies
Indeed, in view of the second inequality of (253), the conditions of (a) are satisfied for \(|t-s|\le \delta \le {\tilde{c}}_2\). Furthermore, we have
Invoking the bounds (258) and (259) in (256) we obtain
In the last step we used (253).
For \(\delta <|t-s|\le N^{-\nu +1/2}\) we expand in powers of \(a=i(s-t)N^{-1/2}\eta _s\) to get
In the last step we applied (255).
Let us prove that \(I^*\) is indeed an interval. Assume the contrary, i.e. there exist \(s<u<t\) such that \(s,t\in I^*\) and \(u\notin I^*\). In particular, \(\tau _t^*\le \varepsilon ^2<\tau _u^*\). Clearly, we can choose u to be a local maximum (stationary) point of the function \(t\rightarrow \tau _t^*\). Denote
An application of (256) to \(Y'=(t-u)(\xi +N^{-1/2}\eta )\) and \(X'=u(\xi +N^{-1/2}\eta )\) gives
Invoking the inequalities \(\tau _t^*\le \varepsilon ^2\) and \(1-\cos (t-u)z\le (t-u)^2z^2/2\) we obtain
Here we used the bound \(|t-u|\le |t-s|\le 5\varepsilon /{\tilde{c}}_1\) proved above.
Denoting \(y=(t-u)z\) we have \(\tau _t^*=1-\mathbf{E}e^{iuz}e^{iy}\). Invoking the expansion \(e^{iy}=1+iy+(iy)^2/2+R'\), where \(|R'|\le y^2/6+|y|^r\), we obtain
For a stationary point u we have \(0=\frac{\partial }{\partial t}\tau _t^*\bigl |_{t=u}=-i\mathbf{E}z e^{iuz}\). Therefore, \(\mathbf{E}ye^{iuz}=0\) and (261) implies
Write the right hand side in the form \(\tau _u^*+2^{-1}(t-u)^2R_1\), where
Note that the inequality \(R_1>0\) contradicts our assumption \(\tau _t^*< \tau _u^*\). We complete the proof by showing that \(R_1>0\).
Since the random variable z is symmetric we have \(\mathbf{E}z^2\sin uz=0\). Therefore,
Given \(\lambda >0\) split
In the last step we used Chebyshev’s inequality. Furthermore, invoking the inequality \(\mathbf{E}(1-\cos uz)=\tau _u^*\le {\tilde{c}}_3\varepsilon ^2\), see (260), we obtain from (262) and (263) for \(\lambda ^2=\varepsilon ^{-1}\sigma _z^2\)
Finally, invoking the inequality \(|t-u|\le |t-s|\le 5{\tilde{c}}_1^{-1}\varepsilon \) we obtain from (264)
where for the random variable \(z=\xi _s+N^{-1/2}\eta _s\) we write
Thus, for \(\varepsilon \) satisfying (254) we have \(R_1>0\).\(\square \)
Appendix 5
Let \(Z_1,\dots , Z_N\) be independent copies of the \(L^r\)-valued random element \(Z=\{x\rightarrow \psi (x,Y)\}\). Recall that almost surely \(\Vert Z\Vert \le N^{\alpha }\). Here \(\Vert \cdot \Vert \) denotes the norm of the Banach space \(L^r\), where \(r>4\) and \(1/2>\alpha >0\). Write \(M_p=\mathbf{E}|\psi (X_1,X_2)|^p\).
Lemma 13
(i) Assume that \(\Vert \mathbf{E}Z\Vert ^2\le \mathbf{E}\Vert Z\Vert ^2/N\). Then there exists a constant \(c(r)>0\) such that for \(k\le N\) and \(x>c(r)\) we have
$$\begin{aligned} \mathbf{P}\{\Vert Z_1+\dots +Z_k\Vert >k^{1/2} u\,x\} \le \exp \{-2^{-5}x^2(1+xN^{\alpha }/k^{1/2}u)^{-1}\}. \end{aligned}$$(266)
Here \(u^2=\mathbf{E}\Vert Z\Vert ^2\).
(ii) The following inequalities hold
$$\begin{aligned}&\Vert \mathbf{E}Z\Vert \le M_r/q_N N^{(r-1)\alpha }, \end{aligned}$$(267)
$$\begin{aligned}&q_N^{-1}(M_2-M_rN^{-(r-2)\alpha }) \le \mathbf{E}\Vert Z\Vert ^2 \le q_N^{-1}(M_r^{2/r}+M_rN^{-(r-2)\alpha }). \end{aligned}$$(268)
Remark. Assume that
Then (267) and (268) imply the inequality \(\Vert \mathbf{E}Z\Vert ^2\le \mathbf{E}\Vert Z\Vert ^2/N\). Note that \(r\alpha >2\) implies \(\varkappa >2\). Furthermore, by (44), the probability \(q_N\) satisfies \(q_N>1-M_rN^{-r\alpha }\).
Proof
We derive (i) from Yurinskii’s [36] inequality. Denote \(\zeta _k=Z_1+\dots +Z_k\). Using the type 2 inequality for the \(L^r\)-valued random variable \(\zeta _k-\mathbf{E}\zeta _k\),
and the inequality \(\Vert Z_1-\mathbf{E}Z_1\Vert ^2\le 2\Vert Z_1\Vert ^2+2\Vert \mathbf{E}Z_1\Vert ^2\), we obtain
We have
It follows from the inequality \(\Vert Z_1\Vert \le N^{\alpha }\) that
Write \(B_k^2=ku^2\). Theorem 2.1 of Yurinskii [36] shows
provided that \({\overline{x}}=x-\beta _k/B_k>0\).
Since \(\beta _k/B_k\le 1+c'(r)(1+k^{-1/2})\) we have, for \(x> c(r):=4c'(r)+2\),
The latter inequality implies
Finally, replacing B by \(B'\) in (269) we obtain (266).
Let us prove (ii). The mean value \(\mathbf{E}Z=\{x\rightarrow \mathbf{E}\psi (x,Y)\}\) is an element of \(L^r\). For \(P_X\) almost all \(x\in \mathcal X\) we have \(\mathbf{E}\psi (x,X)=0\). Therefore,
Invoking (43) and using Chebyshev and Hölder inequalities, we obtain, for \(P_X\) almost all x,
where \(a(x)=(\mathbf{E}|\psi (x,X)|^{r})^{1/r}\). Note that \(\mathbf{E}\Vert Z'\Vert _r^{r}=M_r\) and \(\Vert a\Vert ^r=M_r\). Finally,
Let us prove (268). Denote \(b_p(x)=(\mathbf{E}_{X_1}|\psi (X_1,x)|^p)^{1/p}\). Here \(\mathbf{E}_{X_1}\) denotes the conditional expectation given all the random variables except \(X_1\). We have
By Hölder’s inequality \(b_r(x)\ge b_2(x)\), for \(P_X\) almost all x. Therefore,
Combining (271) and (270) and the bound \(|R|\le M_rN^{-(r-2)\alpha }\) we obtain (268). In order to bound |R| we use (43), \(|R| \le N^{-(r-2)\alpha }\mathbf{E}\Vert Z'\Vert _r^{r-2}b_r^2(X)\), and apply Hölder’s inequality,
The lemma is proved.\(\square \)
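Exponential bounds of the form (266) have a familiar real-valued prototype: Hoeffding's inequality for bounded centered summands. The Monte Carlo sketch below illustrates that prototype only (all parameters arbitrary); it is not the Banach-space bound of the lemma.

```python
import math
import random

random.seed(11)
k, trials, x = 100, 20000, 2.0

# Real-valued analogue of an exponential tail bound for bounded summands:
# for Z_i = +/-1 (so |Z_i| <= 1, E Z_i = 0, u^2 = E Z_i^2 = 1), Hoeffding's
# inequality gives P(Z_1 + ... + Z_k > k^{1/2} x) <= exp(-x^2 / 2).
exceed = 0
for _ in range(trials):
    s = sum(random.choice((-1, 1)) for _ in range(k))
    exceed += s > math.sqrt(k) * x
empirical_tail = exceed / trials
hoeffding_bound = math.exp(-x * x / 2)
```

In (266) the extra factor \((1+xN^{\alpha }/k^{1/2}u)^{-1}\) accounts for the growing almost-sure bound \(N^{\alpha }\) on \(\Vert Z\Vert \).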
Bloznelis, M., Götze, F. Edgeworth approximations for distributions of symmetric statistics. Probab. Theory Relat. Fields 183, 1153–1235 (2022). https://doi.org/10.1007/s00440-022-01144-x
Keywords
- Edgeworth expansion
- Littlewood–Offord problem
- Concentration in Banach spaces
- Symmetric statistic
- U-statistic
- Hoeffding decomposition