Abstract
This paper develops a quantitative version of de Jong’s central limit theorem for homogeneous sums in a high-dimensional setting. More precisely, under appropriate moment assumptions, we establish an upper bound for the Kolmogorov distance between a multi-dimensional vector of homogeneous sums and a Gaussian vector such that the bound depends polynomially on the logarithm of the dimension and is governed by the fourth cumulants and the maximal influences of the components. As a corollary, we obtain high-dimensional versions of fourth-moment theorems, universality results and Peccati–Tudor-type theorems for homogeneous sums. We also sharpen some existing (quantitative) central limit theorems via applications of our result.
1 Introduction
Let \(\varvec{X}=(X_i)_{i=1}^\infty \) be a sequence of independent centered random variables with unit variance. A homogeneous sum is a random variable of the form
$$\begin{aligned} Q(f;\varvec{X}):=\sum _{i_1,\dots ,i_q=1}^{N}f(i_1,\dots ,i_q)X_{i_1}\cdots X_{i_q}, \end{aligned}$$
where \(N,q\in {\mathbb {N}}\), \([N]:=\{1,\dots ,N\}\) and \(f:[N]^q\rightarrow {\mathbb {R}}\) is a symmetric function vanishing on diagonals, i.e., \(f(i_1,\dots ,i_q)=0\) unless \(i_1,\dots ,i_q\) are mutually different. Limit theorems for sequences of homogeneous sums have a long history in probability theory. Rotar’ [53, 54] investigated invariance principles for \(Q(f;\varvec{X})\) with respect to the law of \(\varvec{X}\). In the notable work of de Jong [22], the following striking result was established: For every \(n\in {\mathbb {N}}\), let \(f_n:[N_n]^q\rightarrow {\mathbb {R}}\) be a symmetric function vanishing on diagonals with q fixed and \(N_n\uparrow \infty \) as \(n\rightarrow \infty \). Assume \(\mathrm {E}[X_i^4]<\infty \) for all i and \(\mathrm {E}[Q(f_n;\varvec{X})^2]=1\) for all n. Then, \(Q(f_n;\varvec{X})\) converges in law to the standard normal distribution, provided that the following two conditions hold:
(i) \(\mathrm {E}[Q(f_n;\varvec{X})^4]\rightarrow 3\) as \(n\rightarrow \infty \).
(ii) \(\max _{1\le i\le N_n}{{\,\mathrm{Inf}\,}}_i(f_n)\rightarrow 0\) as \(n\rightarrow \infty \), where \({{\,\mathrm{Inf}\,}}_i(f_n)\) is defined by
$$\begin{aligned} {{\,\mathrm{Inf}\,}}_i(f_n):=\sum _{i_2,\dots ,i_q=1}^{N_n}f_n(i,i_2,\dots ,i_q)^2 \end{aligned}$$(1.1)
and called the influence of the ith variable of \(f_n\).
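For illustration, the objects just defined are straightforward to compute numerically. The following Python sketch (our own illustrative code with a toy random kernel, not part of the paper; numpy assumed) evaluates \(Q(f;\varvec{X})\) and the influences \({{\,\mathrm{Inf}\,}}_i(f)\) for \(q=2\), using the standard identity \(\mathrm {E}[Q(f;\varvec{X})^2]=2\Vert f\Vert _{\ell _2}^2\) for \(q=2\) to normalize f so that \(\mathrm {E}[Q(f;\varvec{X})^2]=1\).

```python
import numpy as np

rng = np.random.default_rng(0)

def make_kernel(N):
    """A toy symmetric kernel f: [N]^2 -> R vanishing on diagonals."""
    f = rng.standard_normal((N, N))
    f = (f + f.T) / 2            # symmetrize
    np.fill_diagonal(f, 0.0)     # vanish on diagonals
    return f

def homogeneous_sum(f, X):
    """Q(f; X) = sum over i1, i2 of f(i1, i2) * X_{i1} * X_{i2} (case q = 2)."""
    return X @ f @ X

def influences(f):
    """Inf_i(f) = sum over i2 of f(i, i2)^2, as in (1.1)."""
    return np.sum(f**2, axis=1)

N = 100
f = make_kernel(N)
# For q = 2 and independent unit-variance X, E[Q(f;X)^2] = 2 ||f||^2,
# so this rescaling enforces E[Q(f;X)^2] = 1.
f /= np.sqrt(2.0) * np.linalg.norm(f)

X = rng.standard_normal(N)    # one draw of independent standardized variables
Q = homogeneous_sum(f, X)
max_influence = influences(f).max()
```

Condition (ii) of de Jong’s theorem asks precisely that `max_influence` vanish asymptotically along the sequence \(f_n\).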
When \(q=1\), condition (ii) says that \(\max _{1\le i\le N_n}f_n(i)^2\rightarrow 0\) as \(n\rightarrow \infty \), which is equivalent to the celebrated Lindeberg condition. In this case condition (i) is automatically implied by (ii), and thus it is superfluous. In contrast, when \(q\ge 2\), condition (ii) is no longer sufficient for the asymptotic normality of the sequence \((Q(f_n;\varvec{X}))_{n=1}^\infty \), so an additional condition is needed. The motivation for introducing condition (i) in [22] was the easily verified fact that condition (i) is equivalent to the asymptotic normality of \((Q(f_n;\varvec{X}))_{n=1}^\infty \) when \(q=2\) and \(\varvec{X}\) is Gaussian (see also [21]). Later on, this observation was significantly improved in the influential paper by Nualart & Peccati [49]: For any q, the asymptotic normality of \((Q(f_n;\varvec{X}))_{n=1}^\infty \) is implied by condition (i) alone as long as \(\varvec{X}\) is Gaussian. Results of this type are nowadays called fourth-moment theorems and have been extensively studied in the past decade. In particular, further investigation of the fourth-moment theorem of [49] has led to the introduction of the so-called Malliavin–Stein method by Nourdin & Peccati [43], which has produced one of the most active research areas in the recent probabilistic literature. We refer the reader to the monograph [44] for an introduction to this subject and to the survey [3] for recent developments.
The implications of the Malliavin–Stein method for de Jong’s central limit theorem (CLT) for homogeneous sums were investigated in the seminal work of Nourdin, Peccati & Reinert [47], where several important extensions of de Jong’s result were developed. The following three results are particularly relevant to our work:
(I) First, they established a multi-dimensional extension of de Jong’s CLT which shows that multi-dimensional vectors of homogeneous sums enjoy a CLT if de Jong’s criterion is satisfied component-wise. More precisely, let \(d\in {\mathbb {N}}\) and, for every \(j=1,\dots ,d\), let \(q_j\in {\mathbb {N}}\) and \(f_{n,j}:[N_n]^{q_j}\rightarrow {\mathbb {R}}\) be a symmetric function vanishing on diagonals. Also, let \(\mathfrak {C}=(\mathfrak {C}_{jk})_{1\le j,k\le d}\) be a \(d\times d\) positive semidefinite symmetric matrix and suppose that
$$\begin{aligned} \max _{1\le j,k\le d}|\mathrm {E}[Q(f_{n,j};\varvec{X})Q(f_{n,k};\varvec{X})]-\mathfrak {C}_{jk}|\rightarrow 0 \end{aligned}$$as \(n\rightarrow \infty \). Then, the d-dimensional random vector
$$\begin{aligned} \varvec{Q}^{(n)}(\varvec{X}):=(Q(f_{n,1};\varvec{X}),\dots ,Q(f_{n,d};\varvec{X})) \end{aligned}$$converges in law to the d-dimensional normal distribution \({\mathcal {N}}_d(0,\mathfrak {C})\) with mean 0 and covariance matrix \(\mathfrak {C}\) as \(n\rightarrow \infty \) if \(\mathrm {E}[Q(f_{n,j};\varvec{X})^4]-3\mathrm {E}[Q(f_{n,j};\varvec{X})^2]^2\rightarrow 0\) and \(\max _{1\le i\le N_n}{{\,\mathrm{Inf}\,}}_i(f_{n,j})\rightarrow 0\) as \(n\rightarrow \infty \) for every \(j=1,\dots ,d\).
(II) Second, they found the following universality of Gaussian variables in the context of homogeneous sums ([47, Theorem 1.2]): Assume
$$\begin{aligned} \sup _n\sum _{i_1,\dots ,i_{q_j}=1}^{N_n}f_{n,j}(i_1,\dots ,i_{q_j})^2<\infty \end{aligned}$$
and \(\mathfrak {C}_{jj}>0\) for every j. If \(\varvec{Q}^{(n)}(\varvec{G})\) converges in law to \({\mathcal {N}}_d(0,\mathfrak {C})\) as \(n\rightarrow \infty \) for a sequence of independent standard Gaussian variables \(\varvec{G}=(G_i)_{i=1}^\infty \), then \(\varvec{Q}^{(n)}(\varvec{X})\) converges in law to \({\mathcal {N}}_d(0,\mathfrak {C})\) as \(n\rightarrow \infty \) for any sequence \(\varvec{X}=(X_i)_{i=1}^\infty \) of independent centered random variables with unit variance and such that \(\sup _i\mathrm {E}[|X_i|^3]<\infty \).
(III) Third, they established some quantitative versions of de Jong’s CLT for homogeneous sums; see Proposition 5.4 and Corollary 7.3 in [47] for details (see also Sect. 2.1.1).
We remark that these results have been generalized in various directions by subsequent studies. For example, the universality results analogous to (II) have also been established for Poisson variables in Peccati & Zheng [52] and i.i.d. variables with zero skewness and nonnegative excess kurtosis in Nourdin et al. [45, 46], respectively. Also, the recent work of Döbler & Peccati [28] has extended (I) and (II) to more general degenerate U-statistics which were originally treated in [22].
As the title of the paper suggests, the aim of this paper is to extend the above results to a high-dimensional setting where the dimension d depends on n and \(d=d_n\rightarrow \infty \) as \(n\rightarrow \infty \). Of course, in such a setting, the “asymptotic distribution” \(\mathcal {N}_d(0,\mathfrak {C})\) also depends on n and, even worse, it is typically no longer tight. Therefore, we need to properly reformulate the above statements in this setting. In this paper, we adopt the so-called metric approach to accomplish this purpose: We try to establish the convergence of some metric between the laws of \(\varvec{Q}^{(n)}(\varvec{X})\) and \(\mathcal {N}_d(0,\mathfrak {C})\). Specifically, we take the Kolmogorov distance as the metric between the probability laws. Namely, letting \(Z^{(n)}\) be a \(d_n\)-dimensional centered Gaussian vector with covariance matrix \(\mathfrak {C}_n\) for each n, we aim at proving the following convergence:
$$\begin{aligned} \sup _{x\in {\mathbb {R}}^{d_n}}|P(\varvec{Q}^{(n)}(\varvec{X})\le x)-P(Z^{(n)}\le x)|\rightarrow 0\quad \text {as }n\rightarrow \infty . \end{aligned}$$
Here, for vectors \(x=(x_1,\dots ,x_{d_n})\in {\mathbb {R}}^{d_n}\) and \(y=(y_1,\dots ,y_{d_n})\in {\mathbb {R}}^{d_n}\), we write \(x\le y\) to express \(x_j\le y_j\) for every \(j=1,\dots ,d_n\). In addition, we are particularly interested in the situation where the dimension \(d=d_n\) increases much faster than the inverse of the “standard” convergence rate of Gaussian approximation for a sequence of univariate homogeneous sums. Given that both \(\sqrt{|\mathrm {E}[Q(f_n;\varvec{X})^4]-3\mathrm {E}[Q(f_{n};\varvec{X})^2]^2|}\) and \(\max _{1\le i\le N_n}\sqrt{{{\,\mathrm{Inf}\,}}_i(f_n)}\) can be the optimal convergence rates of the Gaussian approximation of \(Q(f_{n};\varvec{X})\) in the Kolmogorov distance (see [42, Proposition 3.8] for the former and [30, Remark 1] for the latter), we might consider the quantity
$$\begin{aligned} \delta _n:=\max _{1\le j\le d_n}\sqrt{|\mathrm {E}[Q(f_{n,j};\varvec{X})^4]-3\mathrm {E}[Q(f_{n,j};\varvec{X})^2]^2|+\max _{1\le i\le N_n}{{\,\mathrm{Inf}\,}}_i(f_{n,j})} \end{aligned}$$
as an appropriate definition of the “standard” convergence rate. Then, we aim at proving
$$\begin{aligned} \sup _{x\in {\mathbb {R}}^{d_n}}|P(\varvec{Q}^{(n)}(\varvec{X})\le x)-P(Z^{(n)}\le x)|\le C(\log d_n)^a\delta _n^b \end{aligned}$$(1.2)
for all \(n\in {\mathbb {N}}\), where \(a,b,C>0\) are constants which do not depend on n (here and below we assume \(d_n\ge 2\)). As a byproduct, results of this type enable us to extend fourth-moment theorems and universality results for homogeneous sums to a high-dimensional setting (see Theorem 2.2 for the precise statement).
Our formulation of a high-dimensional extension of CLTs for homogeneous sums is motivated by the recent path-breaking work of Chernozhukov, Chetverikov & Kato [13, 18], where results analogous to (1.2) have been established for sums of independent random vectors. More formally, let \((\xi _{n,i})_{i=1}^n\) be a sequence of independent centered \(d_n\)-dimensional random vectors. Set \(S_n:=n^{-1/2}\sum _{i=1}^n\xi _{n,i}\) and assume \(\mathfrak {C}_n=\mathrm {E}[S_nS_n^\top ]\) (\(\top \) denotes the transpose of a matrix). Then, under an appropriate assumption on moments, we have
where \(C'>0\) is a constant which does not depend on n (see Proposition 2.1 in [18] for the precise statement). Here, we shall remark that the bound in (1.3) depends on n through \(n^{-1/6}\), which is suboptimal when the dimension \(d_n\) is fixed. However, in [18, Remark 2.1(ii)] it is conjectured that the rate \(n^{-1/6}\) is nearly optimal in a minimax sense when \(d_n\) is much larger than n (see also [10, Remark 1]). This conjecture is motivated by the fact that the rate \(n^{-1/6}\) is minimax optimal in CLTs for sums of independent random variables taking values in an infinite-dimensional Banach space (see, e.g., [8, Theorem 2.6]). Given that high-dimensional CLTs of type (1.3) are closely related to Gaussian approximation of the suprema of empirical processes (see, e.g., [15, 17]), it is worth mentioning that a duality argument enables us to translate the minimax rate for CLTs in a Banach space to the one for Gaussian approximation of the suprema of empirical processes with a specific class of functions in the Kolmogorov distance; see [50] for details. For this reason, we also conjecture that \(b=1/3\) would give an optimal dependence on \(\delta _n\) of the bound in (1.2) (note that the rate \(n^{-1/2}\) is the standard convergence rate of CLTs for sums of independent one-dimensional random variables). In this paper, we indeed establish that a bound of type (1.2) holds true with \(b=1/3\) under a moment assumption on \(\varvec{X}\) when the \(q_j\)’s do not depend on j (see Theorem 2.1 and Remark 2.1).
We remark that there are a number of articles extending the scope of the Chernozhukov–Chetverikov–Kato theory (CCK theory for short) in various directions. We refer the reader to the survey [6] for recent developments. Nevertheless, most studies focus on linear statistics (i.e., sums of random variables), and only a few articles are concerned with nonlinear statistics. Two exceptions are U-statistics, developed in [10,11,12, 56], and Wiener functionals, developed in [34, 35]. On the one hand, however, the former are mainly concerned with non-degenerate U-statistics, which are approximately linear statistics via the Hoeffding decomposition (Chen & Kato [11] also handle degenerate U-statistics, but they focus on randomized incomplete versions that are still approximately linear statistics). On the other hand, although the latter deal with essentially nonlinear statistics, they must be functionals of a (possibly infinite-dimensional) Gaussian process, except for [35, Theorem 3.2], which is a version of our result with \(q_j\equiv 2\) (see Sect. 2.1.2 for more details). In this sense, our result would be the first extension of CCK-type results to essentially nonlinear statistics based on possibly non-Gaussian variables.
Finally, we mention that the main results of this paper have potential applications to statistics. In fact, the original motivation of this paper was to improve the Gaussian approximation result for maxima of high-dimensional vectors of random quadratic forms given by [35, Theorem 3.2], which is used to ensure the validity of the bootstrap testing procedure proposed in [35, Section 4.1] (see Sect. 2.2). Another potential application might be specification tests for parametric forms in nonparametric regression. In this area, to derive the null distributions of test statistics, one sometimes needs to approximate the maximum of (essentially degenerate) quadratic forms; see [25, 32, 40] for instance.
This paper is organized as follows. Section 2 presents the main results obtained in the paper, while Sects. 3–7 are devoted to the proof of the main results: Sect. 3 demonstrates a basic scheme of the CCK theory to prove high-dimensional CLTs. Subsequently, Sect. 4 presents a connection of this scheme to Stein’s method. Based on this observation, Sect. 5 develops a high-dimensional CLT of the form (1.2) for homogeneous sums based on normal and gamma variables. Then, Sect. 6 establishes a kind of invariance principle for high-dimensional homogeneous sums using a randomized version of the Lindeberg method. Finally, Sect. 7 completes the proof of the main results.
1.1 Notation
\({\mathbb {Z}}_+\) denotes the set of all nonnegative integers. For \(x=(x_1,\dots ,x_d)\in {\mathbb {R}}^d\), we define \(\Vert x\Vert _{\ell _\infty }:=\max _{1\le j\le d}|x_j|\). For \(N\in {\mathbb {N}}\), we set \([N]:=\{1,\dots ,N\}\). We set \(\sum _{i=p}^q\equiv 0\) if \(p>q\) by convention. For \(q\in {\mathbb {N}}\), we denote by \(\mathfrak {S}_q\) the set of all permutations of [q], i.e., the symmetric group of degree q. For a function \(f:[N]^q\rightarrow {\mathbb {R}}\), we set \({\mathcal {M}}(f):=\max _{1\le i\le N}{{\,\mathrm{Inf}\,}}_i(f)\) (recall that \({{\,\mathrm{Inf}\,}}_i(f)\) is defined according to (1.1)). We also set \( \Vert f\Vert _{\ell _2}:=\sqrt{\sum _{i_1,\dots ,i_q=1}^Nf(i_1,\dots ,i_q)^2}. \) For a function \(h:{\mathbb {R}}^d\rightarrow {\mathbb {R}}\), we set \(\Vert h\Vert _\infty :=\sup _{x\in {\mathbb {R}}^d}|h(x)|\). We write \(C^m_b({\mathbb {R}}^d)\) for the set of all real-valued \(C^m\) functions on \({\mathbb {R}}^d\) all of whose partial derivatives are bounded. We write \(\partial _{j_1\dots j_m}=\frac{\partial ^m}{\partial x_{j_1}\cdots \partial x_{j_m}}\) for short. Throughout the paper, \(Z=(Z_1,\dots ,Z_d)\) denotes a d-dimensional centered Gaussian random vector with covariance matrix \({\mathfrak {C}}=({\mathfrak {C}}_{ij})_{1\le i,j\le d}\) (note that we do not assume that \(\mathfrak {C}\) is positive definite in general). Also, \((q_j)_{j=1}^\infty \) stands for a sequence of positive integers. Throughout the paper, we will regard \((q_j)_{j=1}^\infty \) as fixed, i.e., it does not vary when we consider asymptotic results. Given a probability distribution \(\mu \), we write \(X\sim \mu \) to express that X is a random variable with distribution \(\mu \). For \(\nu >0\), we write \(\gamma (\nu )\) for the gamma distribution with shape \(\nu \) and rate 1. If S is a topological space, \(\mathcal {B}(S)\) denotes the Borel \(\sigma \)-field of S.
Given a random variable X, we set \(\Vert X\Vert _p:=\{\mathrm {E}[|X|^p]\}^{1/p}\) for every \(p>0\). When X satisfies \(\mathrm {E}[X^4]<\infty \), we denote the fourth cumulant of X by \(\kappa _4(X)\). Note that \(\kappa _4(X)=\mathrm {E}[X^4]-3\mathrm {E}[X^2]^2\) if X is centered. For \(\alpha >0\), we define the \(\psi _\alpha \)-norm of X by \( \Vert X\Vert _{\psi _\alpha }:=\inf \{C>0:\mathrm {E}[\psi _\alpha (|X|/C)]\le 1\}, \) where \(\psi _\alpha (x):=\exp (x^\alpha )-1\). Note that \(\Vert \cdot \Vert _{\psi _\alpha }\) is indeed a norm (on a suitable space) if and only if \(\alpha \ge 1\). Some useful properties of the \(\psi _\alpha \)-norm are collected in Appendix A.
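As a numerical illustration of the last two definitions (the code and function names below are ours, not from the paper), one can estimate the \(\psi _\alpha \)-norm of an empirical sample by bisection on C and compute \(\kappa _4\) directly. For a standard Gaussian variable, \(\Vert X\Vert _{\psi _2}=\sqrt{8/3}\approx 1.63\) and \(\kappa _4(X)=0\), which the sketch recovers approximately.

```python
import numpy as np

def psi_alpha_norm(sample, alpha, tol=1e-6):
    """Empirical psi_alpha-norm: the smallest C > 0 with
    mean(psi_alpha(|X|/C)) <= 1, where psi_alpha(x) = exp(x^alpha) - 1."""
    x = np.abs(np.asarray(sample, dtype=float))

    def excess(C):
        return np.mean(np.expm1((x / C) ** alpha)) - 1.0

    lo, hi = 0.0, 1.0
    while excess(hi) > 0:        # grow hi until the constraint is satisfied
        lo, hi = hi, 2.0 * hi
    while hi - lo > tol:         # excess(C) is decreasing in C, so bisect
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0:
            lo = mid
        else:
            hi = mid
    return hi

def kappa4(sample):
    """Fourth cumulant E[X^4] - 3 E[X^2]^2 of a centered sample."""
    x = np.asarray(sample, dtype=float)
    return np.mean(x**4) - 3.0 * np.mean(x**2) ** 2

rng = np.random.default_rng(1)
gauss = rng.standard_normal(100_000)
c2 = psi_alpha_norm(gauss, alpha=2.0)   # population value: sqrt(8/3) ~ 1.63
k4 = kappa4(gauss)                      # ~ 0 for Gaussian data
```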
2 Main Results
Our first main result is a high-dimensional version of de Jong’s CLT for homogeneous sums:
Theorem 2.1
Let \(\varvec{X}=(X_i)_{i=1}^N\) be a sequence of independent centered random variables with unit variance. Set \(w=\frac{1}{2}\) if \(\mathrm {E}[X_i^3]=0\) for every \(i\in [N]\) and \(w=1\) otherwise. For every \(j\in [d]\), let \(f_j:[N]^{q_j}\rightarrow {\mathbb {R}}\) be a symmetric function vanishing on diagonals, and set \(\varvec{Q}(\varvec{X}):=(Q(f_1;\varvec{X}),\dots ,Q(f_d;\varvec{X}))\). Suppose that \(d\ge 2\), \(\underline{\sigma }:=\min _{1\le j\le d}\Vert Z_j\Vert _2>0\) and \(\max _{1\le i\le N}\Vert X_i\Vert _{\psi _\alpha }<\infty \) for some \(\alpha \in (0,w^{-1}]\). Then,
where \(\overline{B}_N:=\max _{1\le i\le N}(\Vert X_i\Vert _{\psi _\alpha }\vee |\mathrm {E}[X_i^3]|)\), \(\overline{q}_d:=\max _{1\le j\le d}q_j\), \(\mu :=\max \{\frac{2}{3}w\overline{q}_d-\frac{1}{6},\frac{2(\overline{q}_d-1)}{3\alpha }+\frac{1}{3}\}\), \(C>0\) depends only on \(\alpha ,\overline{q}_d\) and
with \(\overline{A}_N:=\max _{1\le i\le N}(|\mathrm {E}[X_i^3]|\vee {\Vert X_i\Vert _4})\).
Remark 2.1
(a) Since \(\sum _{i=1}^N{{\,\mathrm{Inf}\,}}_i(f_k)^2\le \Vert f_k\Vert _{\ell _2}^2\mathcal {M}(f_k)\), Theorem 2.1 gives a bound of the form (1.2) under reasonable assumptions when \(q_1=\cdots =q_d\). For example, this is the case when \(\mathrm {E}[Q(f_j;\varvec{X})Q(f_k;\varvec{X})]={\mathfrak {C}}_{jk}\) for all \(j,k\in [d]\), \(\sup _i\Vert X_i\Vert _{\psi _\alpha }<\infty \) and \(\sup _j\Vert f_j\Vert _{\ell _2}<\infty \). Here, we keep \(\sum _{i=1}^N{{\,\mathrm{Inf}\,}}_i(f_k)^2\) rather than \(\Vert f_k\Vert _{\ell _2}^2\mathcal {M}(f_k)\) for convenience of application.
(b) When \(q_j<q_k\) for some \(j,k\in [d]\), the exponents of \(|\kappa _4(Q(f_k;\varvec{X}))|\) and \(\mathcal {M}(f_k)\) appearing in the bound of (2.1) are 1/12, which are halves of those for the case \(q_j=q_k\). This phenomenon is not specific to the high-dimensional setting but common in fourth-moment-type theorems. See Remark 1.9(a) in [29] for more details.
(c) In Sect. 2.1, we compare Theorem 2.1 to two existing results in some detail. The results therein show that the dependence of the bound in (2.1) on the dimension d is as sharp as (and sometimes sharper than) that of the previous results.
We can easily extend Theorem 2.1 to a high-dimensional CLT for homogeneous sums in hyperrectangles as follows. Let \({\mathcal {A}}^\mathrm {re}(d)\) be the set of all hyperrectangles in \({\mathbb {R}}^d\), i.e., \({\mathcal {A}}^\mathrm {re}(d)\) consists of all sets A of the form
$$\begin{aligned} A=\{x=(x_1,\dots ,x_d)\in {\mathbb {R}}^d:a_j\le x_j\le b_j\text { for all }j=1,\dots ,d\} \end{aligned}$$
for some \(-\infty \le a_j\le b_j\le \infty \), \(j=1,\dots ,d\).
Corollary 2.1
Under the assumptions of Theorem 2.1, we have
where \(C'>0\) depends only on \(\alpha ,\overline{q}_d\).
For applications, it is often useful to restate Theorem 2.1 in an asymptotic form as follows.
Corollary 2.2
Let \(\varvec{X}=(X_i)_{i=1}^\infty \) be a sequence of independent centered random variables with unit variance. Set \(w=\frac{1}{2}\) if \(\mathrm {E}[X_i^3]=0\) for every \(i\in {\mathbb {N}}\) and \(w=1\) otherwise. For every \(n\in {\mathbb {N}}\), let \(N_n,d_n\in {\mathbb {N}}\setminus \{1\}\) and \(f_{n,k}:[N_n]^{q_k}\rightarrow {\mathbb {R}}\) (\(k=1,\dots ,d_n\)) be symmetric functions vanishing on diagonals, and set \(\varvec{Q}^{(n)}(\varvec{X}):=(Q(f_{n,1};\varvec{X}),\dots ,Q(f_{n,d_n};\varvec{X}))\). Moreover, for every \(n\in {\mathbb {N}}\), let \(Z^{(n)}=(Z_{n,1},\dots ,Z_{n,d_n})\) be a \(d_n\)-dimensional centered Gaussian vector with covariance matrix \(\mathfrak {C}_n=(\mathfrak {C}_{n,kl})_{1\le k,l\le d_n}\). Suppose that \(\overline{q}_\infty :=\sup _{j\in {\mathbb {N}}}q_j<\infty \), \(\inf _{n\in {\mathbb {N}}}\min _{1\le k\le d_n}\Vert Z_{n,k}\Vert _2>0\), \(\sup _{i\in {\mathbb {N}}}\Vert X_i\Vert _{\psi _\alpha }<\infty \) for some \(\alpha \in (0,w^{-1}]\) and
as \(n\rightarrow \infty \). Moreover, setting \(a_1:=(4w\overline{q}_\infty -2)\vee (4\alpha ^{-1}(\overline{q}_\infty -1)+5)\) and \(a_2:=2\alpha ^{-1}(2\overline{q}_\infty -1)+3\), we suppose that either one of the following conditions is satisfied:
(i) \((\log d_n)^{2a_1}\max _{1\le j\le d_n}|\kappa _4(Q(f_{n,j};\varvec{X}))|\rightarrow 0\) and \((\log d_n)^{2a_1\vee a_2}\max _{1\le j\le d_n}{\mathcal {M}}(f_{n,j})\rightarrow 0\) as \(n\rightarrow \infty \).
(ii) \((\log d_n)^{a_1}\max _{1\le j\le d_n}|\kappa _4(Q(f_{n,j};\varvec{X}))|\rightarrow 0\) and \((\log d_n)^{a_1\vee a_2}\max _{1\le j\le d_n}{\mathcal {M}}(f_{n,j})\rightarrow 0\) as \(n\rightarrow \infty \), and \(q_1=q_2=\cdots \).
Then, we have \(\sup _{A\in {\mathcal {A}}^\mathrm {re}(d_n)}|P(\varvec{Q}^{(n)}(\varvec{X})\in A)-P(Z^{(n)}\in A)|\rightarrow 0\) as \(n\rightarrow \infty \).
Our second main result gives high-dimensional versions of fourth-moment theorems, universality results and Peccati–Tudor-type theorems for homogeneous sums:
Theorem 2.2
Let us keep the same notation as in Corollary 2.2. Suppose that one of the following conditions is satisfied:
(A) \(\varvec{X}\) is a sequence of independent copies of a random variable X such that \(\Vert X\Vert _{\psi _\alpha }<\infty \) for some \(\alpha >0\), \(\mathrm {E}[X^3]=0\) and \(\mathrm {E}[X^4]\ge 3\).
(B) For every i, \(X_i\) is a standardized Poisson random variable with intensity \(\lambda _i>0\), i.e., \(\lambda _i+\sqrt{\lambda _i}X_i\) is a Poisson random variable with intensity \(\lambda _i\). Moreover, \(\inf _{i\in {\mathbb {N}}}\lambda _i>0\).
(C) For every i, \(X_i\) is a standardized gamma random variable with shape \(\nu _i>0\) and unit rate, i.e., \(\nu _i+\sqrt{\nu _i}X_i\sim \gamma (\nu _i)\). Moreover, \(\inf _{i\in {\mathbb {N}}}\nu _i>0\).
Suppose also \(2\le \inf _{j\in {\mathbb {N}}}q_j\le \sup _{j\in {\mathbb {N}}}q_j<\infty \), \(0<\inf _{n\in {\mathbb {N}}}\min _{1\le j\le d_n}\mathfrak {C}^{(n)}_{jj}\le \sup _{n\in {\mathbb {N}}}\max _{1\le j\le d_n}\mathfrak {C}^{(n)}_{jj}<\infty \) and
as \(n\rightarrow \infty \) for every \(a>0\). Then, we have \(\kappa _4(Q(f;\varvec{X}))\ge 0\) for any \(N,q\in {\mathbb {N}}\) and any symmetric function \(f:[N]^q\rightarrow {\mathbb {R}}\) vanishing on diagonals. Moreover, the following conditions are equivalent:
(i) \((\log d_n)^a\max _{1\le j\le d_n}\kappa _4(Q(f_{n,j};\varvec{X}))\rightarrow 0\) as \(n\rightarrow \infty \) for every \(a>0\).
(ii) \((\log d_n)^a\max _{1\le j\le d_n}\sup _{x\in {\mathbb {R}}}|P(Q(f_{n,j};\varvec{X})\le x)-P(Z_{n,j}\le x)|\rightarrow 0\) as \(n\rightarrow \infty \) for every \(a>0\).
(iii) \((\log d_n)^a\sup _{x\in {\mathbb {R}}^{d_n}}|P(\varvec{Q}^{(n)}(\varvec{X})\le x)-P(Z^{(n)}\le x)|\rightarrow 0\) as \(n\rightarrow \infty \) for every \(a>0\).
(iv) \((\log d_n)^a\sup _{x\in {\mathbb {R}}^{d_n}}|P(\varvec{Q}^{(n)}(\varvec{Y})\le x)-P(Z^{(n)}\le x)|\rightarrow 0\) as \(n\rightarrow \infty \) for any \(a>0\) and any sequence \(\varvec{Y}=(Y_i)_{i\in {\mathbb {N}}}\) of centered independent variables with unit variance such that \(\sup _{i\in {\mathbb {N}}}\Vert Y_i\Vert _{\psi _\alpha }<\infty \) for some \(\alpha >0\).
Remark 2.2
(a) The implications (i) \(\Rightarrow \) (iii), (iii) \(\Rightarrow \) (iv) and (ii) \(\Rightarrow \) (iii) can be viewed as high-dimensional versions of fourth-moment theorems, universality results and Peccati–Tudor-type theorems for homogeneous sums, respectively. Here, Peccati–Tudor-type theorems refer to statements in which a joint CLT is implied by component-wise CLTs (Peccati & Tudor [51] established such a result for multiple Wiener–Itô integrals with respect to an isonormal Gaussian process).
(b) The proof of Theorem 2.2 relies on the fact that condition (i) automatically yields \((\log d_n)^a\max _{j}\mathcal {M}(f_{n,j})\rightarrow 0\) as \(n\rightarrow \infty \) for every \(a>0\). On the one hand, this fact has already been established in previous work for cases (A) and (B) (see the proof of Lemma 7.2). On the other hand, for case (C), this fact seems not to have appeared in the literature so far. Indeed, for case (C) we obtain it as a byproduct of the proof of Proposition 5.2 (see Lemma 5.4). As a consequence, Theorem 2.2 seems new for case (C) even in the fixed-dimensional case. We remark that the fourth-moment theorem for case (C) has been established by [1] in the univariate case, which inspired our discussions in Sect. 5 (see also [9]).
2.1 Comparison of Theorem 2.1 to Some Existing Results
2.1.1 Comparison to Corollary 7.3 in Nourdin, Peccati and Reinert [47]
First, we compare our result to the quantitative multi-dimensional CLT for homogeneous sums obtained in Nourdin et al. [47]. To state their result, we need to introduce the notion of contraction, which will also play an important role in Sect. 5.2. For two symmetric functions \(f:[N]^p\rightarrow {\mathbb {R}},g:[N]^q\rightarrow {\mathbb {R}}\) and \(r\in \{0,1,\dots ,p\wedge q\}\), we define the contraction \(f\star _rg:[N]^{p+q-2r}\rightarrow {\mathbb {R}}\) by
$$\begin{aligned} f\star _rg(i_1,\dots ,i_{p+q-2r}):=\sum _{a_1,\dots ,a_r=1}^{N}f(i_1,\dots ,i_{p-r},a_1,\dots ,a_r)g(i_{p-r+1},\dots ,i_{p+q-2r},a_1,\dots ,a_r). \end{aligned}$$
In particular, we have
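In coordinates, \(f\star _rg\) sums out the last r arguments of f and g over [N]. A direct implementation for kernels stored as numpy arrays (our own illustrative code, not from [47]) is:

```python
import numpy as np

def contraction(f, g, r):
    """Contraction f *_r g of f: [N]^p -> R and g: [N]^q -> R.

    (f *_r g)(i_1,...,i_{p+q-2r}) sums, over a_1,...,a_r in [N],
    f(i_1,...,i_{p-r}, a_1,...,a_r) * g(i_{p-r+1},...,i_{p+q-2r}, a_1,...,a_r),
    i.e., the last r axes of f are contracted against the last r axes of g.
    """
    p, q = f.ndim, g.ndim
    assert 0 <= r <= min(p, q)
    return np.tensordot(f, g, axes=(list(range(p - r, p)), list(range(q - r, q))))

# For p = q = 2 the kernels are N x N matrices: f *_1 g is the matrix
# product f g^T, f *_2 g is the inner product <f, g>, and f *_0 g is the
# tensor product.
rng = np.random.default_rng(2)
N = 6
f = rng.standard_normal((N, N)); f = (f + f.T) / 2
g = rng.standard_normal((N, N)); g = (g + g.T) / 2
```

The matrix-kernel case \(p=q=2\) is the one relevant to the quadratic forms of Sect. 2.1.2, where [f] denotes the matrix of a kernel.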
Now we are ready to state the result of [47]. To simplify the notation, we focus only on the identity covariance matrix case and do not keep the explicit dependence of constants on \(q_j\)’s.
Proposition 2.1
(Nourdin et al. [47], Corollary 7.3) Let us keep the same notation as in Theorem 2.1. Suppose that \(\mathfrak {C}_{jk}=\mathrm {E}[Q(f_j;\varvec{X})Q(f_k;\varvec{X})]\) for all \(j,k\in [d]\) and \(\mathfrak {C}\) is the identity matrix of size d. Suppose also that \(\beta :=\max _{1\le i\le N}\mathrm {E}[|X_i|^3]<\infty \) and \(q_d\ge \cdots \ge q_1\ge 2\). Then, we have
where \({\mathcal {C}}({\mathbb {R}}^{d})\) is the set of all convex Borel subsets of \({\mathbb {R}}^d\), \(K>0\) is a constant depending only on \(\overline{q}_d\), \({\mathsf {C}}:=\sum _{i=1}^N\max _{1\le j\le d}{{\,\mathrm{Inf}\,}}_i(f_j)\) and
To compare Proposition 2.1 to our result, we need to bound the quantity \(\overline{\Delta }\) by \(|\kappa _4(Q(f_j;\varvec{X}))|\) and \(\mathcal {M}(f_j)\), \(j\in [d]\). This can be carried out by the following lemma (proved in Sect. 7.4):
Lemma 2.1
Let \(\varvec{X}=(X_i)_{i=1}^N\) be a sequence of independent centered random variables with unit variance and such that \(M:=1+\max _{1\le i\le N}\mathrm {E}[X_i^4]<\infty \). Also, let \(q\ge 2\) be an integer and \(f:[N]^q\rightarrow {\mathbb {R}}\) be a symmetric function vanishing on diagonals. Then, we have
where \(C>0\) depends only on q.
Remark 2.3
The bound in Lemma 2.1 is generally sharp. In fact, it is well known that \(\sqrt{|\kappa _4(Q(f;\varvec{X}))|}\) has the same order as \(\max _{1\le r\le q-1}\Vert f\star _r f\Vert _{\ell _2}\) if \(\varvec{X}\) is Gaussian (see, e.g., Eq.(5.2.6) in [44]). Moreover, if \(q=2\) and \(f(i,j)=N^{-1/2}1_{\{|i-j|=1\}}\), then both \(\Vert f\star _1f\Vert _{\ell _2}\) and \(\Vert f\Vert _{\ell _2}\sqrt{\mathcal {M}(f)}\) are of order \(N^{-1/2}\).
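The \(N^{-1/2}\) orders claimed in this remark are easy to check numerically. In the following sketch (our own code), \(f\star _1f\) is computed as a matrix product since f is a symmetric matrix kernel; both \(\sqrt{N}\,\Vert f\star _1f\Vert _{\ell _2}\) and \(\sqrt{N}\,\Vert f\Vert _{\ell _2}\sqrt{\mathcal {M}(f)}\) stabilize as N grows (at \(\sqrt{6}\) and 2, respectively, by a direct count of the nonzero entries).

```python
import numpy as np

def banded_kernel(N):
    """f(i, j) = N^{-1/2} 1_{|i-j|=1}: symmetric, vanishing on the diagonal."""
    f = np.zeros((N, N))
    idx = np.arange(N - 1)
    f[idx, idx + 1] = N ** (-0.5)
    f[idx + 1, idx] = N ** (-0.5)
    return f

rates = {}
for N in (100, 400, 1600):
    f = banded_kernel(N)
    star = f @ f                                  # f *_1 f, since f is symmetric
    lhs = np.linalg.norm(star)                    # ||f *_1 f||_{l2}
    max_inf = np.sum(f**2, axis=1).max()          # M(f) = maximal influence
    rhs = np.linalg.norm(f) * np.sqrt(max_inf)    # ||f||_{l2} sqrt(M(f))
    # both quantities are O(N^{-1/2}); rescaling by sqrt(N) stabilizes them
    rates[N] = (np.sqrt(N) * lhs, np.sqrt(N) * rhs)
```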
With the help of Lemma 2.1, we observe that the bound in (2.4) typically has the same order as
where
Thus, in the bound of (2.4), the dimension appears as a power of d, while the exponent of the “standard” convergence rate \(\delta :=\max _{1\le j\le d}\sqrt{|\kappa _4(Q(f_j;\varvec{X}))|+\mathcal {M}(f_j)}\) is 1/4. Both are much improved in our result: the former appears as a power of \(\log d\) and the latter is 1/3. Nevertheless, we should note that the bound in (2.4) is given for a much stronger metric than the Kolmogorov distance. In fact, to the best of the author’s knowledge, all the known bounds for this metric depend polynomially on the dimension even for sums of independent random variables; see [58, Section 1.1] and references therein.
Remark 2.4
(a) Roughly speaking, the exponent of \(\delta \) is 1/4 in the bound of (2.4) because this bound is transferred from an analogous quantitative CLT for the Gaussian counterpart by the Lindeberg method with moments matched up to the second order. To overcome this issue, we need to match moments up to the third order, and thus we can no longer rely on the result analogous to Theorem 2.1 for the Gaussian counterpart, which is obtained in [35]. For this reason, we will develop a high-dimensional CLT for homogeneous sums based on normal and gamma variables in Sect. 5.
(b) It is worth noting that the quantity \({\mathsf {C}}=\sum _{i=1}^N\max _{1\le j\le d}{{\,\mathrm{Inf}\,}}_i(f_j)\) in the bound of (2.4) can be much larger than \(\max _{1\le j\le d}\sum _{i=1}^N{{\,\mathrm{Inf}\,}}_i(f_j)=\max _{1\le j\le d}\Vert f_j\Vert _{\ell _2}^2\) in high-dimensional situations (see Remark 2.5 for a concrete example). Indeed, naïve application of the Lindeberg method produces a quantity like \({\mathsf {C}}\), which prevents us from using the Lindeberg method in its pure form (this is why Chernozhukov et al. [13, 18] rely on Stein’s method to prove their high-dimensional CLTs; see [14, Appendix L] for a detailed discussion). In Sect. 6, we will resolve this issue by randomizing the Lindeberg method, as Deng & Zhang [24] have recently done in the context of sums of independent random vectors.
2.1.2 Comparison to Theorem 3.2 in Koike [35]
Next, we compare our result to the Gaussian approximation result for maxima of quadratic forms obtained in [35, Theorem 3.2]. Here, for an explicit comparison, we state this result after applying [35, Corollary 3.1]. For a function \(f:[N]^2\rightarrow {\mathbb {R}}\), we denote the \(N\times N\) matrix \((f(i,j))_{1\le i,j\le N}\) by [f].
Proposition 2.2
(Koike [35], Theorem 3.2 and Corollary 3.1) Let us keep the same notation as in Corollary 2.2. Suppose that \(\inf _{n\in {\mathbb {N}}}\min _{1\le k\le d_n}\Vert Z_{n,k}\Vert _2>0\), \(q_j=2\) for all j, \(\sup _{i\in {\mathbb {N}}}\Vert X_i\Vert _{\psi _2}<\infty \) and (2.2) holds true as \(n\rightarrow \infty \). Suppose also that
as \(n\rightarrow \infty \). Then, we have
as \(n\rightarrow \infty \).
When we apply our result to quadratic forms as above, we obtain the following result.
Proposition 2.3
Let us keep the same notation as in Corollary 2.2. Set
for every n. Assume \(q_1=q_2=\cdots =2\), \(\inf _{n\in {\mathbb {N}}}\min _{1\le k\le d_n}\Vert Z_{n,k}\Vert _2>0\) and (2.2). Assume also that either one of the following conditions is satisfied:
(i) \(\sup _{i\in {\mathbb {N}}}\Vert X_i\Vert _{\psi _1}<\infty \) and \((\log d_n)^5\Delta _n\rightarrow 0\) as \(n\rightarrow \infty \).
(ii) \(\sup _{i\in {\mathbb {N}}}\Vert X_i\Vert _{\psi _2}<\infty \), \(\mathrm {E}[X_i^3]=0\) for all i and \((\log d_n)^3\Delta _n\rightarrow 0\) as \(n\rightarrow \infty \).
Then, we have \(\sup _{A\in {\mathcal {A}}^\mathrm {re}(d_n)}|P(\varvec{Q}^{(n)}(\varvec{X})\in A)-P(Z^{(n)}\in A)|\rightarrow 0\) as \(n\rightarrow \infty \).
Remark 2.5
(a) Regarding the convergence rate of \(\max _{1\le k\le d_n}{{\,\mathrm{tr}\,}}\left( [f_{n,k}]^4\right) \), condition (i) in Proposition 2.3 is stronger than the one in Proposition 2.2. However, the former imposes a weaker moment condition on \(\varvec{X}\) than the latter. More importantly, the second term of \(\Delta _n\) is always smaller than or equal to the second term in (2.5), and the latter can be much larger than the former. For example, let us assume \(N_n=d_n=n\) and consider the functions \(f_{n,k}\) defined as follows:
Then, we have \({{\,\mathrm{Inf}\,}}_i(f_{n,k})=(1+1_{\{1<i<n\}})n^{-1/2}\) if \(i\in \{k,k\pm 1\}\) and \({{\,\mathrm{Inf}\,}}_i(f_{n,k})=(1+1_{\{1<i<n\}})n^{-1}\) otherwise. Therefore, on the one hand
does not converge to 0 as \(n\rightarrow \infty \), but on the other hand \(\max _{1\le k\le d_n}\sqrt{\mathcal {M}(f_{n,k})}\Vert f_{n,k}\Vert _{\ell _2}=O(n^{-1/4})\) as \(n\rightarrow \infty \). Note that in this case we have \(\max _{1\le k\le d_n}\sqrt{{{\,\mathrm{tr}\,}}\left( [f_{n,k}]^4\right) }=O(n^{-1/4})\) and \(\Vert f_{n,k}\Vert _{\ell _2}\rightarrow 1\) as \(n\rightarrow \infty \), so (2.6) holds true due to Proposition 2.3.
(b) Condition (ii) in Proposition 2.3 requires the additional zero-skewness assumption, but it always imposes a weaker assumption on the functions \(f_{n,k}\) than the one in Proposition 2.2.
(c) We have \(\Delta _n\le 2\max _{k}\Vert [f_{n,k}]\Vert _{\mathrm {sp}}\Vert f_{n,k}\Vert _{\ell _2}\) with \(\Vert \cdot \Vert _{\mathrm {sp}}\) the spectral norm of matrices. So \((\log d_n)^a\Delta _n\rightarrow 0\) for some \(a>0\) is implied by \((\log d_n)^a\max _{k}\Vert [f_{n,k}]\Vert _{\mathrm {sp}}\Vert f_{n,k}\Vert _{\ell _2}\rightarrow 0\).
2.2 Statistical Application: Bootstrap Test for the Absence of Lead–Lag Relationship
Let \(W_t=(W_t^1,W_t^2)\) \((t\in {\mathbb {R}})\) be a two-sided bivariate standard Wiener process. Also let \(\rho \in (-1,1)\) and \(\vartheta \in {\mathbb {R}}\) be two (unknown) parameters. We define the bivariate process \(B_t=(B_t^1,B_t^2)\) \((t\in {\mathbb {R}})\) as \(B_t^1=W^1_t\) and \(B_t^2=\rho W^1_{t-\vartheta }+\sqrt{1-\rho ^2}W^2_t\). For each \(\nu =1,2\), we consider the process \(X^\nu =(X^\nu _t)_{t\ge 0}\) given by
where \(\sigma _\nu \in L^2(0,\infty )\) is nonnegative-valued and deterministic. If \(\rho \ne 0\), there is a correlation between \(X^1\) and \(X^2\) with a time lag of \(\vartheta \). We aim to test whether such a correlation really exists, given (possibly asynchronous) high-frequency observations of \(X^1\) and \(X^2\). Specifically, for each \(\nu =1,2\), we observe the process \(X^\nu \) on the interval [0, T] at the deterministic sampling times \(0\le t^\nu _0<t^\nu _1<\cdots <t^\nu _{n_\nu }\le T\), which implicitly depend on the parameter \(n\in {\mathbb {N}}\) such that
as \(n\rightarrow \infty \), where we set \(t^\nu _{-1}:=0\) and \(t^\nu _{n_\nu +1}:=T\) for each \(\nu =1,2\). To test the null hypothesis \(H_0:\rho =0\) against the alternative \(H_1:\rho \ne 0\), Koike [35] proposed the test statistic \(T_n=\sqrt{n}\max _{\theta \in {\mathcal {G}}_n}|U_n(\theta )|\), where \({\mathcal {G}}_n\) is a finite subset of \({\mathbb {R}}\) and
The null distribution of \(T_n\) can be approximated by its Gaussian analog as follows:
Proposition 2.4
([35], Proposition 4.1) For each \(n\in {\mathbb {N}}\), let \((Z_n(\theta ))_{\theta \in {\mathcal {G}}_n}\) be a family of centered Gaussian variables such that \(\mathrm {E}[Z_n(\theta )Z_n(\theta ')]=n{{\,\mathrm{Cov}\,}}[U_n(\theta ),U_n(\theta ')]\) for all \(\theta ,\theta '\in {\mathcal {G}}_n\). Suppose that \(\sup _{t\in [0,T]}(\sigma _1(t)+\sigma _2(t))<\infty \) and there are positive constants \({\underline{v}},{\overline{v}}\) such that
for all \(n\in {\mathbb {N}}\) and \(\theta \in {\mathcal {G}}_n\). Then, under the null hypothesis \(\rho =0\), we have
as \(n\rightarrow \infty \), provided that \(nr_n^2\log ^6(\#{\mathcal {G}}_n)\rightarrow 0\).
Since the distribution of \(\max _{\theta \in {\mathcal {G}}_n}|Z_n(\theta )|\) is analytically intractable, Koike [35] proposed a wild bootstrap procedure to approximate it. Formally, let \((w^1_i)_{i=1}^\infty \) and \((w^2_j)_{j=1}^\infty \) be mutually independent sequences of i.i.d. random variables independent of \(X^1\) and \(X^2\). Assume that \(\mathrm {E}[w^1_1]=\mathrm {E}[w^2_1]=0\), \({{\,\mathrm{Var}\,}}[w^1_1]={{\,\mathrm{Var}\,}}[w^2_1]=1\) and \(\Vert w^1_1\Vert _{\psi _2}\vee \Vert w^2_1\Vert _{\psi _2}<\infty \). Define the bootstrapped test statistic as \(T_n^*=\sqrt{n}\max _{\theta \in {\mathcal {G}}_n}|U_n^*(\theta )|\) where
In [35, Proposition B.8], it is shown that
as \(n\rightarrow \infty \), provided that \(r_n=O(n^{-3/4-\eta })\) and \(\#{\mathcal {G}}_n=O(n^\gamma )\) for some \(\eta ,\gamma >0\) in addition to the assumptions of Proposition 2.4. Our result allows us to relax the condition on \(r_n\) as follows:
Proposition 2.5
Under the assumptions of Proposition 2.4, we have (2.8) as \(n\rightarrow \infty \), provided that \(r_n=O(n^{-1/2-\eta })\) and \(\#{\mathcal {G}}_n=O(n^\gamma )\) for some \(\eta ,\gamma >0\).
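Schematically, the wild bootstrap multiplies the increments entering the statistic by external i.i.d. weights with mean zero and unit variance and recomputes the statistic many times; empirical quantiles of the recomputed values then serve as critical values. The following toy sketch illustrates only this scheme; the increments, the statistic and the sample sizes are illustrative placeholders, not the actual quantities \(U_n^*(\theta )\) of [35].

```python
import math
import random

random.seed(3)
n, B = 400, 500
# Hypothetical "increments" entering the statistic (placeholder data).
incr = [random.gauss(0.0, 1.0) / math.sqrt(n) for _ in range(n)]

def stat(xs):
    # Placeholder statistic; the real one is a maximum over theta.
    return abs(sum(xs))

# Wild bootstrap: multiply each increment by an external N(0,1) weight
# and recompute the statistic B times.
boot = sorted(
    stat([random.gauss(0.0, 1.0) * xi for xi in incr]) for _ in range(B)
)
crit = boot[int(0.95 * B) - 1]  # bootstrap 95% critical value
```

Conditionally on the data, each bootstrap replication here is centered Gaussian with variance \(\sum _i\mathrm {incr}_i^2\approx 1\), so `crit` lands near the 95% quantile of \(|{\mathcal {N}}(0,1)|\).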
3 Chernozhukov–Chetverikov–Kato Theory
In this section, we present the basic scheme of the CCK theory for establishing high-dimensional CLTs. One main ingredient of the CCK theory is the following smooth approximation of the maximum function: for each \(\beta >0\), we define the function \(\Phi _\beta :{\mathbb {R}}^d\rightarrow {\mathbb {R}}\) by
Eq.(1) in [16] states that
for any \(x\in {\mathbb {R}}^d\). Therefore, the larger \(\beta \) is, the better \(\Phi _\beta \) approximates the maximum function. The next lemma, which is a summary of [24, Lemmas 5–6], highlights the key properties of this smooth max function:
Lemma 3.1
For any \(\beta >0\), \(m\in {\mathbb {N}}\) and \(C^m\) function \(h:{\mathbb {R}}\rightarrow {\mathbb {R}}\), there is an \(({\mathbb {R}}^d)^{\otimes m}\)-valued function \(\Upsilon _{\beta }(x)=(\Upsilon ^{j_1,\dots ,j_m}_\beta (x))_{1\le j_1,\dots ,j_m\le d}\) on \({\mathbb {R}}^d\) satisfying the following conditions:
-
(i)
For any \(x\in {\mathbb {R}}^d\) and \(j_1,\dots ,j_m\in [d]\), we have \( |\partial _{j_1\dots j_m}(h\circ \Phi _\beta )(x)|\le \Upsilon _\beta ^{j_1,\dots ,j_m}(x). \)
-
(ii)
For every \(x\in {\mathbb {R}}^d\), we have
$$\begin{aligned} \sum _{j_1,\dots ,j_m=1}^d\Upsilon _\beta ^{j_1,\dots , j_m}(x) \le c_{m}\max _{1\le k\le m}\beta ^{m-k}\Vert h^{(k)}\Vert _\infty , \end{aligned}$$where \(c_{m}>0\) depends only on m.
-
(iii)
For any \(x,t\in {\mathbb {R}}^d\) and \(j_1,\dots ,j_m\in [d]\), we have
$$\begin{aligned} e^{-8\Vert t\Vert _{\ell _\infty }\beta }\Upsilon _\beta ^{j_1,\dots ,j_m}(x+t)\le \Upsilon _\beta ^{j_1,\dots ,j_m}(x)\le e^{8\Vert t\Vert _{\ell _\infty }\beta }\Upsilon _\beta ^{j_1,\dots ,j_m}(x+t). \end{aligned}$$
Remark 3.1
An explicit expression of the constant \(c_m\) in Lemma 3.1 can be derived from [24, Lemma 5]. In particular, we have \(c_1=1\) and \(c_2=3\).
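For concreteness, the approximation property above can be checked numerically. The sketch below assumes the standard log-sum-exp form \(\Phi _\beta (x)=\beta ^{-1}\log \sum _{j=1}^de^{\beta x_j}\) used in [16], for which \(0\le \Phi _\beta (x)-\max _{1\le j\le d}x_j\le \beta ^{-1}\log d\).

```python
import math
import random

def smooth_max(x, beta):
    # Phi_beta(x) = beta^{-1} * log(sum_j exp(beta * x_j)), computed stably.
    m = max(x)
    return m + math.log(sum(math.exp(beta * (xj - m)) for xj in x)) / beta

random.seed(0)
d = 1000
x = [random.gauss(0.0, 1.0) for _ in range(d)]
for beta in (1.0, 10.0, 100.0):
    gap = smooth_max(x, beta) - max(x)
    assert 0.0 <= gap <= math.log(d) / beta  # larger beta => tighter approximation
```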
Another important ingredient of the CCK theory is the so-called anti-concentration inequality. For our purpose, the following one is particularly useful (see [19] for the proof):
Lemma 3.2
(Nazarov’s inequality) If \(\underline{\sigma }:=\min _{1\le j\le d}\Vert Z_j\Vert _2>0\), for any \(x\in {\mathbb {R}}^d\) and \(\varepsilon >0\) we have
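In the special case where Z has i.i.d. standard normal components, the probability increment in Nazarov's inequality is available in closed form, so the bound (taken here with the constant \(\varepsilon (\sqrt{2\log d}+2)/\underline{\sigma }\), the form commonly used in the CCK literature) can be checked numerically; this is an illustration, not a proof.

```python
import math

# Standard normal CDF.
Phi = lambda t: 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

d, eps = 10_000, 0.01
# For Z with i.i.d. N(0,1) components (so sigma_min = 1):
#   P(Z <= (x + eps)*1) - P(Z <= x*1) = Phi(x + eps)^d - Phi(x)^d.
bound = eps * (math.sqrt(2.0 * math.log(d)) + 2.0)
worst = max(Phi(x / 100 + eps) ** d - Phi(x / 100) ** d for x in range(-600, 601))
assert 0.0 <= worst <= bound
```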
These tools enable us to establish the following form of smoothing inequality:
Proposition 3.1
Let \(g_{0}:{\mathbb {R}}\rightarrow [0,1]\) be a measurable function such that \(g_{0}(t)=1\) for \(t\le 0\) and \(g_{0}(t)=0\) for \(t\ge 1\). Also, let \(\varepsilon >0\) and set \(\beta :=\varepsilon ^{-1}\log d\). Suppose that \(\underline{\sigma }:=\min _{1\le j\le d}\Vert Z_j\Vert _2>0\). Then, for any d-dimensional random vector F, we have
where
Proof
This result has been essentially shown in Step 2 in the proof of [18, Lemma 5.1]. \(\square \)
Remark 3.2
Proposition 3.1 can be seen as a special version of more general smoothing inequalities such as [7, Lemma 2.1]. An important feature of bound (3.2) is that the quantity \(\Delta _\varepsilon (F,Z)\) contains only test functions of the form \(x\mapsto g_0(\Phi _\beta (x-y))\) for some \(y\in {\mathbb {R}}^d\). If \(g_0\) is sufficiently smooth, derivatives of such a test function admit good estimates with respect to the dimension d, as seen from Lemma 3.1.
It might be worth mentioning that we can use Proposition 3.1 to derive a bound for the Kolmogorov distance in terms of the Wasserstein distance. Let us recall the definition of the Wasserstein distance.
Definition 3.1
(Wasserstein distance) For d-dimensional random vectors F, G with integrable components, the Wasserstein distance between the laws of F and G is defined by
where \(\mathcal {H}\) denotes the set of all functions \(h:{\mathbb {R}}^d\rightarrow {\mathbb {R}}\) such that
Here, \(\Vert \cdot \Vert \) is the usual Euclidean norm on \({\mathbb {R}}^d\).
Corollary 3.1
Under the assumptions of Proposition 3.1, we have
Proof
It suffices to consider the case \({\mathcal {W}}_1(F,Z)>0\). Let us define the function \(g_0:{\mathbb {R}}\rightarrow [0,1]\) by \(g_0(x)=\min \{1,\max \{1-x,0\}\}\), \(x\in {\mathbb {R}}\). Then, for any \(x,x',y\in {\mathbb {R}}^d\) and \(\varepsilon >0\), we have \(|g_0(\varepsilon ^{-1}\Phi _\beta (x-y))-g_0(\varepsilon ^{-1}\Phi _\beta (x'-y))|\le \varepsilon ^{-1}\Vert x-x'\Vert _{\ell _\infty }\) by [13, Lemma A.3], so we obtain \(\Delta _\varepsilon (F,Z)\le \varepsilon ^{-1}{\mathcal {W}}_1(F,Z)\). Now, setting \(\varepsilon =\sqrt{\underline{\sigma }{\mathcal {W}}_1(F,Z)/(2\sqrt{2\log d}+4)}\), we infer the desired result from Proposition 3.1. \(\square \)
When \(d=1\), Corollary 3.1 recovers the standard estimate (cf. Eq.(C.2.6) in [44]). We remark that a bound similar to the above (with a slightly different constant) has already appeared in [4, Theorem 3.1].
Remark 3.3
It is generally impossible to derive (1.2)-type bounds from the corresponding ones for the Wasserstein distance. To see this, let \(F=(F_1,\dots ,F_d)\) be a d-dimensional random vector such that the laws of \(F_1,\dots ,F_d\) are identical (and integrable). Also, let \(G=(G_1,\dots ,G_d)\) be another d-dimensional random vector satisfying the same condition. Then, we can easily verify \({\mathcal {W}}_1(F,G)\ge \sqrt{d}{\mathcal {W}}_1(F_1,G_1)\) by definition: for any 1-Lipschitz function \(h_0:{\mathbb {R}}\rightarrow {\mathbb {R}}\), the function \(x\mapsto d^{-1/2}\sum _{j=1}^dh_0(x_j)\) is 1-Lipschitz with respect to the Euclidean norm, and testing against functions of this form yields the claimed lower bound.
4 Stein Kernels and High-Dimensional CLTs
In the rest of the paper, we fix a \(C^\infty \) function \(g_0:{\mathbb {R}}\rightarrow [0,1]\) such that \(g_0(t)=1\) for \(t\le 0\) and \(g_0(t)=0\) for \(t\ge 1\). For example, we can take \(g_0(t)=f_0(1-t)/\{f_0(t)+f_0(1-t)\}\), where the function \(f_0:{\mathbb {R}}\rightarrow {\mathbb {R}}\) is defined by \(f_0(t)=e^{-1/t}\) if \(t>0\) and \(f_0(t)=0\) otherwise.
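The defining properties of this bump-type function are easy to check directly:

```python
import math

def f0(t):
    return math.exp(-1.0 / t) if t > 0 else 0.0

def g0(t):
    # g0(t) = f0(1 - t) / (f0(t) + f0(1 - t)): equals 1 for t <= 0, 0 for t >= 1.
    return f0(1.0 - t) / (f0(t) + f0(1.0 - t))

assert g0(-0.5) == 1.0 and g0(0.0) == 1.0   # f0 vanishes on (-inf, 0]
assert g0(1.0) == 0.0 and g0(2.0) == 0.0
assert abs(g0(0.5) - 0.5) < 1e-12           # symmetric midpoint
```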
To make Proposition 3.1 useful, we need to obtain a “good” upper bound for the quantity \(\Delta _\varepsilon (F,Z)\). As briefly mentioned in Remark 2.4, Chernozhukov et al. [13] have pointed out that Stein’s method effectively solves this task. Moreover, discussions in [16, 35] implicitly suggest that the CCK theory would have a nice connection to Stein kernels. In this section, we illustrate this idea.
Definition 4.1
(Stein kernel) Let \(F=(F_1,\dots ,F_d)\) be a centered d-dimensional random vector. A \(d\times d\) matrix-valued measurable function \(\tau _F=(\tau _F^{ij})_{1\le i,j\le d}\) on \({\mathbb {R}}^d\) is called a Stein kernel for (the law of) F if \(\max _{1\le i,j\le d}\mathrm {E}[|\tau _F^{ij}(F)|]<\infty \) and
for any \(\varphi \in C^\infty _b({\mathbb {R}}^d)\).
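By Stein's lemma, the constant identity matrix is a Stein kernel for the standard Gaussian law, in which case (4.1) reads \(\mathrm {E}[F_i\varphi (F)]=\mathrm {E}[\partial _i\varphi (F)]\). A Monte Carlo check of this special case (the test function below is an arbitrary smooth bounded choice):

```python
import math
import random

random.seed(1)
n = 200_000
lhs = rhs = 0.0
for _ in range(n):
    f1, f2 = random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)  # F ~ N(0, I_2)
    t = math.tanh(f1 + 0.5 * f2)   # phi(F) for phi(x) = tanh(x1 + 0.5*x2)
    lhs += f1 * t                  # F_1 * phi(F)
    rhs += 1.0 - t * t             # d_1 phi(F)
lhs /= n
rhs /= n
assert abs(lhs - rhs) < 0.02  # agreement up to Monte Carlo error
```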
Remark 4.1
In this paper, we adopt \(C^\infty _b({\mathbb {R}}^d)\) as the class of test functions for which identity (4.1) holds true because this is convenient for our purposes, but other classes are also used in the literature; see [20] for instance.
Lemma 4.1
Let \(F=(F_1,\dots ,F_d)\) be a centered d-dimensional random vector. Also, let \(\tau _F=(\tau _F^{ij})_{1\le i,j\le d}\) be a Stein kernel for F. Then, we have
for any \(\beta >0\) and \(h\in C^\infty _b({\mathbb {R}})\), where
Proof
The proof is essentially the same as that of [16, Theorem 1] or [35, Proposition 2.1], so we omit it. \(\square \)
Proposition 4.1
Suppose that \(d\ge 2\) and \(\underline{\sigma }:=\min _{1\le j\le d}\Vert Z_j\Vert _2>0\). Under the assumptions of Lemma 4.1, there is a universal constant \(C>0\) such that
Proof
Thanks to [44, Lemma 4.1.3], it suffices to consider the case \(\Delta >0\). By Lemma 4.1, for any \(\varepsilon >0\) we have \(\Delta _\varepsilon (F,Z)\le C'\varepsilon ^{-2}(\log d)\Delta \), where \(C'>0\) is a universal constant. Therefore, Proposition 3.1 yields
Now, setting \(\varepsilon =\Delta ^{1/3}(\log d)^{1/6}\), we obtain the desired result. \(\square \)
5 A High-Dimensional CLT for Normal-Gamma Homogeneous Sums
In view of the results in Sect. 4, we naturally seek a situation where a vector of homogeneous sums has a Stein kernel. This is the case when all the components are eigenfunctions of a Markov diffusion operator (cf. Proposition 5.1 in [39]). Moreover, as clarified in [1, 9, 38], only a certain spectral property of the Markov diffusion operator is essential for deriving a fourth-moment-type bound for the variance of the corresponding Stein kernel. This spectral property is satisfied, in particular, when each \(X_i\) is either a Gaussian or a (standardized) gamma variable, so this section focuses on such a situation and derives a high-dimensional CLT for this special case.
For each \(\nu >0\), we denote by \(\gamma _\pm (\nu )\) the distribution of the random variable \(\pm (X-\nu )/\sqrt{\nu }\) with \(X\sim \gamma (\nu )\). Also, for every \(q\in {\mathbb {N}}\) we set \( \mathfrak {c}_q:=\sum _{r=1}^qr!\left( {\begin{array}{c}q\\ r\end{array}}\right) ^2. \)
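For illustration, the combinatorial constant \(\mathfrak {c}_q\) and the first cumulants of the standardized gamma laws can be computed directly (the cumulant facts used below are standard for \(\gamma (\nu )\): \(\kappa _1=\kappa _2=\nu \) and \(\kappa _3=2\nu \)).

```python
import math

def frak_c(q):
    # c_q = sum_{r=1}^q r! * binom(q, r)^2.
    return sum(math.factorial(r) * math.comb(q, r) ** 2 for r in range(1, q + 1))

assert frak_c(1) == 1
assert frak_c(2) == 1 * 2**2 + 2 * 1**2           # = 6
assert frak_c(3) == 1 * 3**2 + 2 * 3**2 + 6 * 1   # = 33

# gamma_+(nu) is the law of (X - nu)/sqrt(nu) for X ~ Gamma(nu, 1): since
# kappa_1 = kappa_2 = nu and kappa_3 = 2*nu, it is centered with unit variance
# and has skewness 2/sqrt(nu), which vanishes as nu -> infinity.
nu = 4.0
skewness = 2.0 * nu / nu ** 1.5
assert abs(skewness - 2.0 / math.sqrt(nu)) < 1e-12
```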
Proposition 5.1
Let us keep the same notation as in Theorem 2.1 and assume \(d\ge 2\). Let \(\varvec{Y}=(Y_i)_{i=1}^N\) be a sequence of independent random variables such that the law of \(Y_i\) belongs to \(\{{\mathcal {N}}(0,1)\}\cup \{\gamma _+(\nu ):\nu>0\}\cup \{\gamma _-(\nu ):\nu >0\}\) for all i. For every i, define the constants \(v_i\) and \(\eta _i\) by
We also set \(w_*=1/2\) if \(Y_i\sim {\mathcal {N}}(0,1)\) for all i and \(w_*=1\) otherwise. Then, \(\kappa _4(Q(f_j;\varvec{Y}))\ge 0\) for all j and
for any \(\beta >0\) and \(h\in C^\infty _b({\mathbb {R}})\), where \(C>0\) depends only on \(\overline{q}_d\) and
with \(\overline{v}_N:=\max _{1\le i\le N}v_i\) and \(\underline{\eta }_N:=\min _{1\le i\le N}\eta _i\).
The rest of this section is devoted to the proof of Proposition 5.1. Throughout, we assume that the probability space \((\Omega ,{\mathcal {F}},P)\) is given by the product probability space \((\prod _{i=1}^N\Omega _i,\bigotimes _{i=1}^N{\mathcal {F}}_i,\bigotimes _{i=1}^NP_i)\), where
Then, we realize the variables \(Y_1,\dots ,Y_N\) as follows: For \(\omega =(\omega _1,\dots ,\omega _N)\in \Omega \), we define
5.1 \(\Gamma \)-Calculus
Our first aim is to construct a suitable Markov diffusion operator whose eigenspaces contain all the components of \(\varvec{Q}(\varvec{Y})\). In the following, for an open subset U of \({\mathbb {R}}^m\), we write \(C^\infty _p(U)\) for the set of all real-valued \(C^\infty \) functions on U all of whose partial derivatives have at most polynomial growth.
First, we denote by \({{\,\mathrm{L}\,}}_\text {OU}\) the Ornstein–Uhlenbeck operator on \({\mathbb {R}}\). Next, for every \(\nu >0\), we write \({{\,\mathrm{L}\,}}_\nu \) for the Laguerre operator on \((0,\infty )\) with parameter \(\nu \). We then define the operators \({\mathcal {L}}_1,\dots ,{\mathcal {L}}_N\) by
Finally, we construct the densely defined symmetric operator \({{\,\mathrm{L}\,}}\) in \(L^2(P)\) by tensorization of \({\mathcal {L}}_1,\dots ,{\mathcal {L}}_N\) (see Section 2.2 of [2] for details). We will use the following properties of \({{\,\mathrm{L}\,}}\) (cf. [1] and Section 2.2 of [2]):
-
(i)
If F and G are eigenfunctions of \(-{{\,\mathrm{L}\,}}\) associated with eigenvalues p and q, respectively, FG belongs to \(\bigoplus _{k=0}^{p+q}{{\,\mathrm{Ker}\,}}({{\,\mathrm{L}\,}}+k{{\,\mathrm{Id}\,}})\).
-
(ii)
The eigenspaces of \({{\,\mathrm{L}\,}}_\text {OU}\) and \({{\,\mathrm{L}\,}}_\nu \) associated with eigenvalue \(k\in {\mathbb {Z}}_+\) are given by \({{\,\mathrm{Ker}\,}}({{\,\mathrm{L}\,}}_\text {OU}+k{{\,\mathrm{Id}\,}})=\{aH_k:a\in {\mathbb {R}}\}\) and \({{\,\mathrm{Ker}\,}}({{\,\mathrm{L}\,}}_\nu +k{{\,\mathrm{Id}\,}})=\{aL^{(\nu -1)}_k:a\in {\mathbb {R}}\}\), respectively. Here, \(H_k\) and \(L_k^{(\alpha )}\) denote the Hermite polynomial of degree k and Laguerre polynomial of degree k and parameter \(\alpha >-1\), respectively.
-
(iii)
The eigenspace of \({{\,\mathrm{L}\,}}\) associated with eigenvalue k is given by
$$\begin{aligned} {{\,\mathrm{Ker}\,}}({{\,\mathrm{L}\,}}+k{{\,\mathrm{Id}\,}})=\bigoplus _{\begin{array}{c} k_1+\cdots +k_N=k\\ k_1,\dots ,k_N\in {\mathbb {Z}}_+ \end{array}} {{\,\mathrm{Ker}\,}}({\mathcal {L}}_1+k_1{{\,\mathrm{Id}\,}})\otimes \cdots \otimes {{\,\mathrm{Ker}\,}}({\mathcal {L}}_N+k_N{{\,\mathrm{Id}\,}}).\nonumber \\ \end{aligned}$$(5.2)
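The eigenfunction statements in (ii) can be verified by hand for low degrees; the sketch below assumes the standard normalizations \({{\,\mathrm{L}\,}}_\text {OU}f=f''-xf'\) and \({{\,\mathrm{L}\,}}_\nu f=xf''+(\nu -x)f'\) of the Ornstein–Uhlenbeck and Laguerre generators.

```python
# Probabilists' Hermite polynomial H_3(x) = x^3 - 3x: L_OU H_3 = -3 * H_3.
H3 = lambda x: x**3 - 3.0 * x
dH3 = lambda x: 3.0 * x**2 - 3.0
d2H3 = lambda x: 6.0 * x
for x in (-2.0, -0.3, 0.7, 1.9):
    assert abs((d2H3(x) - x * dH3(x)) + 3.0 * H3(x)) < 1e-9

# Laguerre polynomial L_1^{(nu-1)}(x) = nu - x: L_nu applied to it gives -1 times it.
nu = 2.5
L1 = lambda x: nu - x
for x in (0.1, 1.0, 3.7):
    # x * (L1)'' + (nu - x) * (L1)' = (nu - x) * (-1) = -1 * L1(x).
    assert abs((x * 0.0 + (nu - x) * (-1.0)) + L1(x)) < 1e-9
```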
Let us write \({\mathcal {S}}=C^\infty _p(\Omega )\). We define the carré du champ operator of \({{\,\mathrm{L}\,}}\) by
for all \(F,G\in {\mathcal {S}}\). The following lemma is a special case of [39, Proposition 5.1].
Lemma 5.1
For every \((i,j)\in [d]^2\), define the function \(\tau ^{ij}:{\mathbb {R}}^d\rightarrow {\mathbb {R}}\) by
Then, \(\tau =(\tau ^{ij})_{1\le i,j\le d}\) is a Stein kernel for \(\varvec{Q}(\varvec{Y})\).
We refer to [5] for more details about these operators.
5.2 A Bound for the Variance of the Carré du Champ Operator
In view of Lemmas 4.1 and 5.1, we obtain (5.1) once we show that
where \(C>0\) depends only on \(\overline{q}_d\). As a first step, we estimate \({{\,\mathrm{Var}\,}}[\Gamma ( Q(f_j;\varvec{Y}),Q(f_k;\varvec{Y}))]\) for every \((j,k)\in [d]^2\). More precisely, our aim here is to prove the following result:
Proposition 5.2
Let \(p\le q\) be two positive integers. Let \(f:[N]^p\rightarrow {\mathbb {R}}\) and \(g:[N]^q\rightarrow {\mathbb {R}}\) be symmetric functions vanishing on diagonals and set \(F:=Q(f;\varvec{Y})\) and \(G:=Q(g;\varvec{Y})\). Then, \(\kappa _4(F)\ge 0\), \(\kappa _4(G)\ge 0\) and
Before starting the proof, let us remark on how this result is related to the preceding studies. When \(f=g\), Azmoodeh et al. [1] have derived a better estimate than (5.4) in a more general setting. Their proof technique can also be applied to the case \(f\ne g\), and this has been implemented in Campese et al. [9]. However, this leads to a bound containing the quantity \({{\,\mathrm{Cov}\,}}[F^2,G^2]-2{{\,\mathrm{E}\,}}\left[ FG\right] ^2\), so an additional argument is needed to estimate it. For this reason, we take an alternative route for the proof, which is inspired by the discussions in Zheng [59] as well as [9, Proposition 3.6]. As a byproduct of this strategy, we obtain inequality (5.11), which leads to the universality of gamma variables.
We begin by introducing some notation. We write \(J_k\) for the orthogonal projection of \(L^2(P)\) onto the eigenspace \({{\,\mathrm{Ker}\,}}({{\,\mathrm{L}\,}}+k{{\,\mathrm{Id}\,}})\). For every i, we define the random variable \({\mathfrak {p}}_{2}(Y_i)\) by
The following lemma can be proved by a straightforward computation.
Lemma 5.2
For every i, \(\mathrm {E}[{\mathfrak {p}}_2(Y_i)]=\mathrm {E}[Y_i{\mathfrak {p}}_2(Y_i)]=0\) and \(\mathrm {E}[\mathfrak {p}_2(Y_i)^2]=v_i\).
Next, given \(h,h':[N]^r\rightarrow {\mathbb {R}}\), we define
Note that \(\Vert h\Vert _{\ell _2}^2=\langle h,h\rangle \). For every \(r\in \{0,1,\dots ,p\wedge q\}\), we define the function \(f\mathbin {\widehat{\star _{r}^0}}g:[N]^{p+q-r}\rightarrow {\mathbb {R}}\) by
Note that we have
where \(\widetilde{f\star _rg}\) is the symmetrization of \(f\star _rg\) (recall (2.3)). Finally, we set \(\Delta _q^N:=\{(i_1,\dots ,i_q)\in [N]^q:i_j\ne i_k\text { if }j\ne k\}\).
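Although the displays defining the contractions are not reproduced here, for \(p=q=2\) the standard contraction \(f\star _rg\) (summing over r identified arguments) reduces to familiar matrix operations, which the following toy sketch assumes: \(f\star _1g\) is the matrix product and \(f\star _2g\) is the entrywise inner product \(\langle f,g\rangle \).

```python
# Symmetric 3 x 3 kernels vanishing on the diagonal.
N = 3
f = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
g = [[0, 4, 5], [4, 0, 6], [5, 6, 0]]

# (f star_1 g)(i, j) = sum_a f(i, a) * g(a, j): one contracted index.
star1 = [[sum(f[i][a] * g[a][j] for a in range(N)) for j in range(N)]
         for i in range(N)]
# f star_2 g = sum_{i, j} f(i, j) * g(i, j): both indices contracted.
star2 = sum(f[i][j] * g[i][j] for i in range(N) for j in range(N))

assert star1[0][1] == 0 * 4 + 1 * 0 + 2 * 6   # = 12
assert star2 == 2 * (1 * 4 + 2 * 5 + 3 * 6)   # = 64 by symmetry
```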
The next lemma is a key part in our proof.
Lemma 5.3
Under the assumptions of Proposition 5.1, we have
Moreover, if \(p=q\), we have
Proof
We can deduce from [48, Proposition 2.9] and (5.5) that
Next, let \((i_1,\dots ,i_{p+q-r})\in \Delta ^N_{p+q-r}\) and \((j_1,\dots ,j_{p+q-l})\in \Delta ^N_{p+q-l}\). Thanks to Lemma 5.2,
does not vanish if and only if the following condition is satisfied:
-
(\(\star \)) \((i_1,\dots ,i_{p+q-2r})\) is a permutation of \((j_1,\dots ,j_{p+q-2l})\) and \((i_{p+q-2r+1},\dots ,i_{p+q-r})\) is a permutation of \((j_{p+q-2l+1},\dots ,j_{p+q-l})\).
Note that the condition (\(\star \)) can hold true only if \(r=l\). Moreover, if the condition (\(\star \)) is satisfied, we have
by Lemma 5.2. Since there are \((p+q-2r)!\) permutations of \((i_1,\dots ,i_{p+q-2r})\) and \(r!\) permutations of \((i_{p+q-2r+1},\dots ,i_{p+q-r})\), respectively, (5.8) yields
Now, (5.9) is especially true when all the \(Y_i\)’s follow the standard normal distribution. Therefore, the product formula for multiple integrals with respect to an isonormal Gaussian process yields
where \(f\mathbin {{\widetilde{\otimes }}}g\) is the symmetrization of \(f\otimes g\). Combining this formula with (5.9), we obtain (5.6).
Next, we prove (5.7). Similar arguments to the above yield
and
Combining these results with the Schwarz inequality, we infer that
The desired result now follows from Eq.(50) in [27] and Hölder’s inequality. \(\square \)
Lemma 5.4
Under the assumptions of Proposition 5.1, we have
and
Proof
The proof is parallel to that of [59, Lemma 3.1], but uses Lemma 5.3 instead of [59, Lemma 2.1]. \(\square \)
Proof of Proposition 5.2
The nonnegativity of \(\kappa _4(F)\) and \(\kappa _4(G)\) follows from (5.11). Inequality (5.4) can be shown in a manner similar to the proof of [59, Theorem 1.1], using Lemmas 5.3 and 5.4 instead of Lemmas 2.1 and 3.1 in [59], respectively. \(\square \)
5.3 Proof of Proposition 5.1
We have already established the nonnegativity of the \(\kappa _4(Q(f_j;\varvec{Y}))\)'s in Proposition 5.2. The remaining claim of the proposition follows once we prove (5.3). By Proposition 5.2 and Lemma A.2, it suffices to show that, under the assumptions of Proposition 5.1,
where \(C_{p,q}>0\) depends only on p, q. To prove (5.12), note that
Hence, using Lemma A.5, we can deduce (5.12) by a hypercontractivity argument similar to those in [33, Section 5] and [41, Section 3.2]. \(\square \)
6 Randomized Lindeberg Method
For any \(\varpi \ge 0\) and \(x\ge 0\), we set
The aim of this section is to prove the following result.
Proposition 6.1
Let \(\varvec{X}=(X_i)_{i=1}^N\) and \(\varvec{Y}=(Y_i)_{i=1}^N\) be two sequences of independent centered random variables with unit variance. Suppose that \(M_N:=\max _{1\le i\le N}(\Vert X_i\Vert _{\psi _\alpha }\vee \Vert Y_i\Vert _{\psi _\alpha })<\infty \) for some \(\alpha \in (0,2]\), and set \(\Lambda _i:=(\log d)^{(\overline{q}_d-1)/\alpha }\max _{1\le k\le d}M_N^{q_k-1}\sqrt{{{\,\mathrm{Inf}\,}}_{i}(f_k)}\) for \(i\in [N]\). Suppose also that there is an integer \(m\ge 3\) such that \(\mathrm {E}[X_i^r]=\mathrm {E}[Y_i^r]\) for all \(i\in [N]\) and \(r\in [m-1]\). Then, for any \(h\in C^m_b({\mathbb {R}})\), \(\beta >0\) and \(\tau ,\rho \ge 0\) with \(\tau \rho M_N\max _{1\le i\le N}\Lambda _i\le \beta ^{-1}\), we have
where \(C>0\) depends only on \(m,\alpha ,\overline{q}_d\), \(K_1\) depends only on \(\alpha \), and \(K_2,K_3>0\) depend only on \(\alpha ,\overline{q}_d\).
Remark 6.1
Proposition 6.1 can be viewed as a version of [47, Theorem 7.1]. Apart from the fact that we take higher-order moment matching into account, there are important differences between these two results. On the one hand, the latter takes all \(C^3\) functions with bounded third-order partial derivatives as test functions, while the former focuses only on test functions of the form \(x\mapsto h(\Phi _\beta (x-y))\) for some \(h\in C^m_b({\mathbb {R}})\) and \(y\in {\mathbb {R}}^d\). On the other hand, in the bound of (6.1), terms like \(\sum _{i=1}^N\max _{1\le k\le d}{{\,\mathrm{Inf}\,}}_i(f_k)\) always appear with exponential factors, so we can remove such terms by appropriately selecting the parameters \(\tau ,\rho \). In contrast, such a quantity appears (as the constant C) in the dominant term of the bound given by [47, Theorem 7.1]. As pointed out in Remark 2.4(b), this can be crucial in a high-dimensional setting, and this phenomenon originates from a (naïve) application of the Lindeberg method. To avoid this difficulty, we use a randomized version of the Lindeberg method, which was originally introduced in [24] for sums of independent random vectors.
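To illustrate the mechanism behind the (randomized) Lindeberg method, here is a minimal sketch for a toy quadratic homogeneous sum: the coordinates of \(\varvec{X}\) are replaced by those of \(\varvec{Y}\) one at a time along a uniformly random order, and the one-swap differences telescope to the total difference \(h(Q(f;\varvec{Y}))-h(Q(f;\varvec{X}))\). The kernel, the test function and the laws below are illustrative choices, not those of Proposition 6.1.

```python
import math
import random

random.seed(2)
N = 50
# Toy symmetric kernel vanishing on the diagonal.
f = [[0.0 if i == j else 1.0 / N for j in range(N)] for i in range(N)]

def Q(w):
    # Quadratic homogeneous sum Q(f; w) = sum_{i != j} f(i, j) w_i w_j.
    return sum(f[i][j] * w[i] * w[j] for i in range(N) for j in range(N))

h = math.tanh  # smooth bounded test function

X = [random.choice([-1.0, 1.0]) for _ in range(N)]  # e.g. Rademacher coordinates
Y = [random.gauss(0.0, 1.0) for _ in range(N)]      # Gaussian coordinates
order = list(range(N))
random.shuffle(order)  # the randomization: a uniform swapping order

W, prev, diffs = X[:], h(Q(X)), []
for i in order:
    W[i] = Y[i]        # replace one coordinate at a time
    cur = h(Q(W))
    diffs.append(cur - prev)
    prev = cur

# After all swaps W == Y, and the one-swap differences telescope.
assert W == Y
assert abs(sum(diffs) - (h(Q(Y)) - h(Q(X)))) < 1e-12
```

The proof then estimates each one-swap difference by a Taylor expansion in the swapped coordinate, and the random order is what removes the problematic maximal-influence terms from the dominant part of the bound.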
For the proof, we need three auxiliary results. The first one is a generalization of [36, Lemma S.5.1]:
Lemma 6.1
Let \(\xi \) be a nonnegative random variable such that \(P(\xi >x)\le Ae^{-(x/B)^\alpha }\) for all \(x\ge 0\) and some constants \(A,B,\alpha >0\). Then, we have
for any \(p>\alpha \) and \(t>0\).
Proof
The proof is analogous to that of [36, Lemma S.5.1] and elementary, so we omit it. \(\square \)
The second one is a moment inequality for homogeneous sums with a sharp constant:
Lemma 6.2
Let \(\varvec{X}=(X_i)_{i=1}^N\) be a sequence of independent centered random variables. Suppose that \(M:=\max _{1\le i\le N}\Vert X_i\Vert _{\psi _\alpha }<\infty \) for some \(\alpha \in (0,2]\). Also, let \(q\in {\mathbb {N}}\) and \(f:[N]^q\rightarrow {\mathbb {R}}\) be a symmetric function vanishing on diagonals. Then,
for any \(p\ge 2\), where \(K_{\alpha ,q}>0\) depends only on \(\alpha ,q\).
Since we need additional lemmas to prove Lemma 6.2, we postpone its proof to Appendix B.
The third one is well known and follows immediately from the commutativity of addition, but it deserves to be stated explicitly for later reference.
Lemma 6.3
Let S be a finite set and \(\varphi \) be a real-valued function on S. Also, let \(b:S\rightarrow S\) be a bijection. Then, \(\sum _{x\in A}\varphi (b(x))=\sum _{x\in b(A)}\varphi (x)\) for any \(A\subset S\).
Now we turn to the main body of the proof. Throughout the proof, we will use the standard multi-index notation. For a multi-index \(\lambda =(\lambda _1,\dots ,\lambda _d)\in {\mathbb {Z}}_+^d\), we set \(|\lambda |:=\lambda _1+\cdots +\lambda _d\), \(\lambda !:=\lambda _1!\cdots \lambda _d!\) and \(\partial ^\lambda :=\partial _1^{\lambda _1}\cdots \partial _d^{\lambda _d}\) as usual. Also, given a vector \(x=(x_1,\dots ,x_d)\in {\mathbb {R}}^d\), we write \(x^\lambda =x_1^{\lambda _1}\cdots x_d^{\lambda _d}\).
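These conventions are straightforward to spell out on an example:

```python
import math

# For lam = (2, 0, 1) in Z_+^3 and x = (1.5, -2.0, 3.0):
lam = (2, 0, 1)
x = (1.5, -2.0, 3.0)

assert sum(lam) == 3                                   # |lam| = lam_1 + lam_2 + lam_3
assert math.prod(math.factorial(k) for k in lam) == 2  # lam! = 2! * 0! * 1!
x_lam = math.prod(xi ** k for xi, k in zip(x, lam))    # x^lam = 1.5^2 * (-2)^0 * 3^1
assert abs(x_lam - 6.75) < 1e-12
```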
Proof of Proposition 6.1
Without loss of generality, we may assume \(\varvec{X}\) and \(\varvec{Y}\) are independent. Throughout the proof, for two real numbers a and b, the notation \(a\lesssim b\) means that \(a\le cb\) for some constant \(c>0\) which depends only on \(m,\alpha ,\overline{q}_d\).
Take a vector \(y\in {\mathbb {R}}^d\) and define the function \(\Psi :{\mathbb {R}}^d\rightarrow {\mathbb {R}}\) by \(\Psi (x)=h(\Phi _\beta (x-y))\) for \(x\in {\mathbb {R}}^d\). For any \(i\in [N]\), \(\sigma \in {\mathfrak {S}}_N\) and \(k\in [d]\), we define
and
Then, we set \(\varvec{U}^\sigma _{i}=(U_{k,i}^\sigma )_{k=1}^d\) and \(\varvec{V}^\sigma _{i}=(V^\sigma _{k,i})_{k=1}^d\). By construction, \(\varvec{U}^\sigma _{i}\) and \(\varvec{V}^\sigma _{i}\) are independent of \(X_{\sigma (i)}\) and \(Y_{\sigma (i)}\). Moreover, we have \(Q(f_k;\varvec{W}^\sigma _{i-1})=U^\sigma _{k,i}+Y_{\sigma (i)}V^\sigma _{k,i}\) and \(Q(f_k;\varvec{W}^\sigma _{i})=U^\sigma _{k,i}+X_{\sigma (i)}V^\sigma _{k,i}\) (with \(\varvec{W}^\sigma _0:=(Y_{\sigma (1)},\dots ,Y_{\sigma (N)})\)). In particular, by Lemma 6.3 it holds that \(Q(f_k;\varvec{W}^\sigma _{0})=Q(f_k;\varvec{Y})\) and \(Q(f_k;\varvec{W}^\sigma _{N})=Q(f_k;\varvec{X})\). Therefore, we obtain
Taylor’s theorem and the independence of \(X_{\sigma (i)}\) and \(Y_{\sigma (i)}\) from \(\varvec{U}_{i}^\sigma \) and \(\varvec{V}_{i}^\sigma \) yield
for \(\xi \in \{X_{\sigma (i)},Y_{\sigma (i)}\}\), where
Since \(\mathrm {E}[X_i^r]=\mathrm {E}[Y_i^r]\) for all \(i\in [N]\) and \(r\in [m-1]\) by assumption, we obtain
where \({\mathbf {I}}_i^\sigma :={\mathbf {I}}_i^\sigma [X_{\sigma (i)}]+{\mathbf {I}}_i^\sigma [Y_{\sigma (i)}]\), \(\mathbf {II}_i^\sigma :=\mathbf {II}_i^\sigma [X_{\sigma (i)}]+\mathbf {II}_i^\sigma [Y_{\sigma (i)}]\) and
for \(\xi \in \{X_{\sigma (i)},Y_{\sigma (i)}\}\) and \({\mathcal {E}}_{\sigma ,i}:=\{(|X_{\sigma (i)}|+|Y_{\sigma (i)}|)\Vert \varvec{V}^\sigma _i\Vert _{\ell _\infty }\le \tau \rho M_N\Lambda _{\sigma (i)}\}\).
First, we consider \({\mathbf {I}}_i^\sigma \). Since \(\tau \rho M_N\max _{1\le i\le N}\Lambda _i\le \beta ^{-1}\) by assumption, Lemma 3.1 and the independence of \(X_{\sigma (i)},Y_{\sigma (i)}\) from \(\varvec{U}^\sigma _i,\varvec{V}^\sigma _i\) imply that
where
and \({\mathcal {C}}_{\sigma ,i}:=\{|X_{\sigma (i)}|+|Y_{\sigma (i)}|\le \tau M_N\},{\mathcal {D}}_{\sigma ,i}:=\{\Vert \varvec{V}^\sigma _i\Vert _{\ell _\infty }\le \rho \Lambda _{\sigma (i)}\}\).
We begin by estimating \({\mathbf {I}}(1)_i^\sigma \). Let \((\delta _i)_{i=1}^N\) be a sequence of i.i.d. Bernoulli variables independent of \(\varvec{X}\) and \(\varvec{Y}\) with \(P(\delta _i=1)=1-P(\delta _i=0)=i/(N+1)\). We set \(\zeta _{i,a}:=\delta _i X_a+(1-\delta _i)Y_a\) for all \(i,a\in [N]\). Then, since \(\Vert \zeta _{i,\sigma (i)}\varvec{V}^\sigma _i\Vert _{\ell _\infty }\le \tau \rho M_N\max _{1\le i\le N}\Lambda _i\le \beta ^{-1}\) on the set \({\mathcal {C}}_{\sigma ,i}\cap {\mathcal {D}}_{\sigma ,i}\), by Lemma 3.1 we obtain
The subsequent discussions are inspired by the proof of [24, Lemma 2] and we introduce some notation analogous to theirs. For any \(i,a\in [N]\), we set
where \(\#S\) denotes the number of elements in a set S. We also set
for every \(i\in \{0,1\dots ,N\}\). Moreover, for any \(A,B\subset [N]\) with \(A\cap B=\emptyset \) and \(i\in A\cup B\), we define the random variable \(W^{(A,B)}_i\) by \(W^{(A,B)}_i:=X_i\) if \(i\in A\) and \(W^{(A,B)}_i:=Y_i\) if \(i\in B\). Then, we define
for any \(k\in [d]\) and \((A,B)\in \bigcup _{i=0}^\infty \mathcal {A}_i\), and set \(\varvec{Q}^{(A,B)}:=(Q_k^{(A,B)})_{k=1}^d\). We also define
for any \(k\in [d]\), \(i,a\in [N]\) and \((A,B)\in \bigcup _{j=1}^N{\mathcal {A}}_{j,a}\) and set \(\varvec{U}_a^{(A,B)}:=(U_{k,a}^{(A,B)})_{k=1}^d\) and \(\varvec{V}_a^{(A,B)}:=(V_{k,a}^{(A,B)})_{k=1}^d\). Finally, for any \(\sigma \in {\mathfrak {S}}_N\) and \(i\in [N]\) we set \(A^\sigma _i:=\{\sigma (1),\dots ,\sigma (i-1)\}\) and \(B^\sigma _i:=\{\sigma (i+1),\dots ,\sigma (N)\}\).
Now, since we have \(W^\sigma _{i,j}=W^{(A^\sigma _i,B^\sigma _i)}_{\sigma (j)}\) for \(j\in [N]\setminus \{i\}\), it holds that \(\varvec{U}^\sigma _{i}=\varvec{U}^{(A^\sigma _i,B^\sigma _i)}_{\sigma (i)}\) and \(\varvec{V}^\sigma _{i}=\varvec{V}^{(A^\sigma _i,B^\sigma _i)}_{\sigma (i)}\) by Lemma 6.3. Therefore, we obtain
Now, for \((A,B)\in {\mathcal {A}}_{i,a}\) we have \(\varvec{U}^{(A,B)}_{a}+X_a\varvec{V}^{(A,B)}_{a}=\varvec{Q}^{(A\cup \{a\},B)}\) and \(\varvec{U}^{(A,B)}_{a}+Y_a\varvec{V}^{(A,B)}_{a}=\varvec{Q}^{(A,B\cup \{a\})}\), so we obtain
Hence, Lemma 3.1 yields
Now, by Lemma 6.2 we have
for any \(r\ge 1\), where \(C_{\alpha ,\overline{q}_d}>0\) depends only on \(\alpha ,\overline{q}_d\). Hence, the Minkowski inequality yields
Thus, if \(\overline{q}_d>1\), Lemma A.5 yields
Therefore, by Lemmas A.2 and A.6 we conclude that
This inequality also holds true when \(\overline{q}_d=1\) because in this case the \(V^{(A\setminus \{a\},B\setminus \{a\})}_{k,a}\)'s are non-random, and thus it is a direct consequence of (6.6). As a result, we obtain
Next, we estimate \({\mathbf {I}}(2)^\sigma _i\). Since \(X_{\sigma (i)}\) and \(Y_{\sigma (i)}\) are independent of \(\varvec{U}^\sigma _i\) and \(\varvec{V}^\sigma _i\),
Hence, Lemma 3.1 yields
Now, if \(\overline{q}_d>1\), (6.5) and Lemma A.5 yield
where \(c_{\alpha ,\overline{q}_d}>0\) depends only on \(\alpha ,\overline{q}_d\). Hence, Lemmas A.2 and A.6 yield
for every \(r\ge 1\) with \(c'_{\alpha ,\overline{q}_d}>0\) depending only on \(\alpha ,\overline{q}_d\). This inequality also holds true when \(\overline{q}_d=1\) because in this case \(\varvec{V}^\sigma _{i}\) is non-random and thus it is a direct consequence of (6.5). Meanwhile, (A.3) and Lemma A.3 yield \( P({\mathcal {C}}_{\sigma ,i}^c)\le 2e^{-(\tau /2^{1\vee \alpha ^{-1}})^{\alpha }}. \) Consequently, we obtain
Third, we estimate \({\mathbf {I}}(3)_i^\sigma \). Lemma 3.1 yields
If \(\overline{q}_d>1\), (6.8) and Lemma A.4 yield
for every \(x>0\) with \(K_{\alpha ,\overline{q}_d}>0\) depending only on \(\alpha ,\overline{q}_d\). Hence, Lemma 6.1 yields
Meanwhile, if \(\overline{q}_d=1\), \(\varvec{V}^\sigma _{i}\) is non-random, so (6.8) yields \( {{\,\mathrm{E}\,}}\left[ \Vert \varvec{V}^\sigma _{i}\Vert _{\ell _\infty }^m;{\mathcal {D}}_{\sigma ,i}^c\right] \lesssim \Lambda _{\sigma (i)}^m1_{\{c'_{\alpha ,\overline{q}_d}>\rho \}}. \) Consequently, setting \(K_{\alpha ,\overline{q}_d}':=K_{\alpha ,\overline{q}_d}\vee c'_{\alpha ,\overline{q}_d}\), we obtain
Now, combining (6.4), (6.7), (6.9), (6.10) with Lemma A.6, we obtain
Next, we consider \(\mathbf {II}_i^\sigma \). Lemma 3.1 yields
Since \(X_{\sigma (i)}\) and \(Y_{\sigma (i)}\) are independent of \(\varvec{V}^\sigma _{i}\), Lemma A.6 and (6.8) imply that
for every \(r\ge 1\) with \(L_{\alpha ,\overline{q}_d}>0\) depending only on \(\alpha ,\overline{q}_d\). Thus, by Lemma A.4 we obtain
for every \(x>0\) with \(L'_{\alpha ,\overline{q}_d}>0\) depending only on \(\alpha ,\overline{q}_d\). So Lemma 6.1 yields
Combining this inequality with (6.2), (6.3) and (6.11), we complete the proof. \(\square \)
7 Proof of the Main Results
7.1 Proof of Theorem 2.1
The following result is a version of [47, Lemma 4.3]. Its proof is a minor modification of the proof of that result, so we omit it.
Lemma 7.1
Let \(q\in {\mathbb {N}}\) and \(f:[N]^q\rightarrow {\mathbb {R}}\) be a symmetric function vanishing on diagonals. Also, let \(\varvec{X}=(X_i)_{i=1}^N\) and \(\varvec{Y}=(Y_i)_{i=1}^N\) be two sequences of independent centered random variables with unit variance. Suppose that there are integers \(3\le m\le l\) such that \(M_N:=\max _{1\le i\le N}(\Vert X_i\Vert _{l}\vee \Vert Y_i\Vert _{l})<\infty \) and \(\mathrm {E}[X_i^r]=\mathrm {E}[Y_i^r]\) for all \(i\in [N]\) and \(r\in [m-1]\). Then, we have \(Q(f;\varvec{X}),Q(f;\varvec{Y})\in L^l(P)\) and
where \(C>0\) depends only on q, l.
Proof of Theorem 2.1
Throughout the proof, for two real numbers a and b, the notation \(a\lesssim b\) means that \(a\le cb\) for some constant \(c>0\) which depends only on \(\alpha ,\overline{q}_d\). Moreover, if \((\log d)^{\mu +\frac{1}{2}}\delta _1[\varvec{Q}(\varvec{X})]^{\frac{1}{3}}\ge 1\), then the claim evidently holds true with \(C=1\), so we may assume \((\log d)^{\mu +\frac{1}{2}}\delta _1[\varvec{Q}(\varvec{X})]^{\frac{1}{3}}<1\).
Set \(s_i:=\mathrm {E}[X_i^3]\) for every i. We take a sequence \(\varvec{Y}=(Y_i)_{i=1}^N\) of independent random variables such that
By construction, we have \(\mathrm {E}[X_i^r]=\mathrm {E}[Y_i^r]\) for any \(i\in [N]\) and \(r\in [3]\). Moreover, one can easily check that \(\Vert Y_i\Vert _r\le \overline{B}_N(r-1)^w\) for any \(i\in [N]\) and \(r\ge 2\). Hence, by Lemma A.5 we have \(\max _{1\le i\le N}\Vert Y_i\Vert _{\psi _\alpha }\le c_\alpha \overline{B}_N\) with \(c_\alpha \ge 1\) depending only on \(\alpha \). Therefore, applying Proposition 6.1 with \(m=4\), we obtain
for any \(\varepsilon >0\) and \(\tau ,\rho \ge 0\) with \(\tau \rho c_\alpha \overline{B}_N\max _{1\le i\le N}\Lambda _i\le \varepsilon /\log d\), where \(C_1,K_1,K_2,K_3>0\) depend only on \(\alpha ,\overline{q}_d\), and \(\Lambda _i:=(\log d)^{(\overline{q}_d-1)/\alpha }\max _{1\le k\le d}\overline{B}_N^{q_k-1}\sqrt{{{\,\mathrm{Inf}\,}}_{i}(f_k)}\). We apply this inequality with \(\tau :=(\log d^2)^{1/\alpha }\{K_1\vee (K_3/K_2)\}\), \(\rho :=(\log d^2)^{(\overline{q}_d-1)/\alpha }K_2\) and
By construction, we have
Therefore, we obtain
Since \(3+4\{(\overline{q}_d-1)/\alpha -\mu \}\le \frac{4}{3\alpha }(\overline{q}_d-1)+\frac{5}{3}\le 2\mu +1\) and \((\log d)^{\mu +\frac{1}{2}}\delta _1[\varvec{Q}(\varvec{X})]^{\frac{1}{3}}<1\), we conclude that
Meanwhile, Proposition 5.1 yields
Now, in the present situation, the constants \(w_*\), \(\overline{v}_N\) and \(\overline{\eta }_N\) appearing in Proposition 5.1 satisfy \(w_*=w\), \(\overline{v}_N\le 2+\overline{A}_N^2/2\) and \(\overline{\eta }_N^{-1}\le \overline{A}_N/2\), so we have
Moreover, by a standard hypercontractivity argument, we have \(\Vert Q(f_j;\varvec{Y})\Vert _4\lesssim \overline{A}_N^{q_j}\Vert Q(f_j;\varvec{Y})\Vert _2\) for every j. Also, since Lemma A.5 yields \(\max _{1\le i\le N}\Vert Y_i\Vert _{\psi _\alpha }\lesssim \underline{\eta }_N^{-1}\), by Lemma 7.1 (with \(l=m=4\)) we obtain
for every k. Since we have \(\Vert Q(f_k;\varvec{Y})\Vert _2=\sqrt{q_k!}\Vert f_k\Vert _{\ell _2}=\Vert Q(f_k;\varvec{X})\Vert _2\) for every k, it holds that \( \delta _2[\varvec{Q}(\varvec{Y})]\lesssim (\log d)^{2w\overline{q}_d-1}\delta _1[\varvec{Q}(\varvec{X})]. \) Consequently, we obtain
Since \(2(w\overline{q}_d-\mu )\le \frac{2}{3}w\overline{q}_d+\frac{1}{3}\le \mu +\frac{1}{2}\), we conclude that
Therefore, Proposition 3.1 yields
This completes the proof. \(\square \)
7.2 Proof of Corollaries 2.1 and 2.2
Corollary 2.1 can be shown in a manner analogous to the proof of [18, Corollary 5.1], applying Theorem 2.1 instead of [18, Lemma 5.1]. Corollary 2.2 immediately follows from Corollary 2.1. \(\square \)
7.3 Proof of Theorem 2.2
Lemma 7.2
Let \(q\ge 2\) and \(f:[N]^q\rightarrow {\mathbb {R}}\) be a symmetric function vanishing on diagonals. Suppose that the sequence \(\varvec{X}\) satisfies one of conditions (A)–(C). Then, we have \(\kappa _4(Q(f;\varvec{X}))\ge 0\) and
Proof
The first inequality in (7.3) is a consequence of Eq. (1.9) in [47] (note that their definition of \({{\,\mathrm{Inf}\,}}_i(f)\) equals ours divided by \((q-1)!\)). To prove the second inequality in (7.3), first suppose that \(\varvec{X}\) satisfies condition (A). Let \(\varvec{G}=(G_i)_{i\in {\mathbb {N}}}\) be a sequence of independent standard normal variables. Then, by [45, Proposition 3.1] we have \(\kappa _4(Q(f;\varvec{X}))\ge \kappa _4(Q(f;\varvec{G}))\), so (5.11) yields the desired result. Next, when \(\varvec{X}\) satisfies condition (B), the desired result follows from Eq. (5.3) in [29]. Finally, when \(\varvec{X}\) satisfies condition (C), the desired result follows from (5.11). This completes the proof. \(\square \)
Lemma 7.3
Let F, G be two random variables such that \(\Vert F\Vert _{\psi _\alpha }\vee \Vert G\Vert _{\psi _\alpha }<\infty \) for some \(\alpha >0\). Then, we have
for any \(r\ge 1\) and \(b\in (0,1)\), where \(\varvec{\Gamma }\) denotes the gamma function.
Proof
This is an easy consequence of [55, Theorem 8.16] and Lemma A.3. \(\square \)
Proof of Theorem 2.2
The inequality \(\kappa _4(Q(f;\varvec{X}))\ge 0\) is proved in Lemma 7.2. The implication (iv) \(\Rightarrow \) (iii) \(\Rightarrow \) (ii) is obvious. The implication (i) \(\Rightarrow \) (iv) follows from Corollary 2.2 and Lemma 7.2.
It remains to prove (ii) \(\Rightarrow \) (i). In view of Lemma 7.3, it is enough to prove \(\sup _{n\in {\mathbb {N}}}\max _{1\le j\le d_n}(\Vert Q(f_{n,j};\varvec{X})\Vert _{\psi _{\beta }}+\Vert Z_{n,j}\Vert _{\psi _{\beta }})<\infty \) for some \(\beta >0\). This follows from Lemmas 6.2 and A.5. \(\square \)
7.4 Proof of Lemma 2.1
Let us define the sequence of random variables \((Y_i)_{i=1}^N\) in the same way as in the proof of Theorem 2.1. Then, since \(\mathrm {E}[Y_i^4]=3+\frac{3}{2}|\mathrm {E}[X_i^3]|\le \frac{9}{2}M\), Lemma 7.1 yields
where \(C_1>0\) depends only on q. Now, since \(\mathrm {E}[Q(f;\varvec{X})^2]=q!\Vert f\Vert _{\ell _2}^2=\mathrm {E}[Q(f;\varvec{Y})^2]\) and \(\sqrt{\kappa _4(Q(f;\varvec{Y}))}\ge q\cdot q!\max _{1\le r\le q-1}\Vert f\star _rf\Vert _{\ell _2}\) by Lemma 7.2, we obtain the desired result. \(\square \)
7.5 Proof of Proposition 2.3
Lemma 7.4
Let \(\varvec{X}=(X_i)_{i=1}^N\) be a sequence of independent centered random variables with unit variance and such that \(M:=\max _{1\le i\le N}\mathrm {E}[X_i^4]<\infty \). Also, let \(f:[N]^2\rightarrow {\mathbb {R}}\) be a symmetric function vanishing on diagonals. Then, we have
where \(C>0\) is a universal constant.
Proof
By Proposition 3.1 and Eq. (3.1) in [21], we have
where
Since we have \(2\Vert f\Vert _{\ell _2}^4-G_{\text {V}}\le 8\Vert f\Vert _{\ell _2}^2\mathcal {M}(f)\), it holds that \(|\kappa _4(Q(f;\varvec{X}))|\le |\mathrm {E}[Q(f;\varvec{X})^4]-6G_{\text {V}}|+48\Vert f\Vert _{\ell _2}^2\mathcal {M}(f)\). Meanwhile, a straightforward computation yields
Hence, we obtain
where \(C_1>0\) is a universal constant. Since it holds that
we obtain the desired result. \(\square \)
Proof of Proposition 2.3
The desired result immediately follows from Corollary 2.2 and Lemma 7.4. \(\square \)
7.6 Proof of Proposition 2.5
Define the \(n_1\times n_2\) matrix \(\Xi _n(\theta )\) by \(\Xi _n(\theta )= (\frac{1}{2}\Delta _i^nX^1K^{ij}_\theta \Delta _j^nX^2)_{i,j},\) and set
Note that \(U_n^*(\theta )={\varvec{w}}^\top {\widetilde{\Xi }}_n(\theta ){\varvec{w}}\) with \({\varvec{w}}=((w^1_i)_{i=1}^{n_1},(w^2_j)_{j=1}^{n_2})^\top \). Hence, by Proposition 2.3, it suffices to prove
Claims (7.4) and (7.5) are established in the proof of [35, Proposition B.8] under the current assumptions. Moreover, arguing as in the bound for the quantity \(\mathrm {E}[R_{n,1}^*]\) in the proof of [35, Proposition B.8], we deduce that for any \(p\ge 1\)
Exchanging \(X^1\) and \(X^2\), we obtain a similar estimate. Hence, (7.6) holds by assumption. \(\square \)
Availability of data and materials
Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.
References
Azmoodeh, E., Campese, S., Poly, G.: Fourth Moment Theorems for Markov diffusion generators. J. Funct. Anal. 266, 2341–2359 (2014)
Azmoodeh, E., Malicet, D., Mijoule, G., Poly, G.: Generalization of the Nualart–Peccati criterion. Ann. Probab. 44, 924–954 (2016)
Azmoodeh, E., Peccati, G.: Malliavin–Stein method: a survey of recent developments. Working paper. arXiv:1809.01912 (2018)
Azmoodeh, E., Peccati, G., Poly, G.: The law of iterated logarithm for subordinated Gaussian sequences: uniform Wasserstein bounds. ALEA Lat. Am. J. Probab. Math. Stat. 13, 659–686 (2016)
Bakry, D., Gentil, I., Ledoux, M.: Analysis and Geometry of Markov Diffusion Operators. Springer, Berlin (2014)
Belloni, A., Chernozhukov, V., Chetverikov, D., Hansen, C., Kato, K.: High-dimensional econometrics and regularized GMM. Working paper. (2018) arXiv:1806.01888
Bentkus, V.: On the dependence of the Berry–Esseen bound on dimension. J. Statist. Plann. Inference 113, 385–402 (2003)
Bentkus, V., Götze, F., Paulauskas, V., Račkauskas, A.: The accuracy of Gaussian approximation in Banach spaces. In: Prokhorov, Y., Statulevicius, V. (eds.) Limit Theorems of Probability Theory, Chapter II, pp. 25–111. Springer, Berlin (2000)
Campese, S., Nourdin, I., Peccati, G., Poly, G.: Multivariate Gaussian approximations on Markov chaoses. Electron. Commun. Probab. 21, 1–9 (2016)
Chen, X.: Gaussian and bootstrap approximations for high-dimensional U-statistics and their applications. Ann. Stat. 46, 642–678 (2018)
Chen, X., Kato, K.: Randomized incomplete \(U\)-statistics in high dimensions. Ann. Stat. 47, 3127–3156 (2019)
Chen, X., Kato, K.: Jackknife multiplier bootstrap: finite sample approximations to the \(U\)-process supremum with applications. Probab. Theory Relat. Fields 176, 1097–1163 (2020)
Chernozhukov, V., Chetverikov, D., Kato, K.: Gaussian approximations and multiplier bootstrap for maxima of sums of high-dimensional random vectors. Ann. Stat. 41, 2786–2819 (2013)
Chernozhukov, V., Chetverikov, D., Kato, K.: Supplement to “Gaussian approximations and multiplier bootstrap for maxima of sums of high-dimensional random vectors”. https://doi.org/10.1214/13-AOS1161SUPP (2013)
Chernozhukov, V., Chetverikov, D., Kato, K.: Gaussian approximation of suprema of empirical processes. Ann. Stat. 42, 1564–1597 (2014)
Chernozhukov, V., Chetverikov, D., Kato, K.: Comparison and anti-concentration bounds for maxima of Gaussian random vectors. Probab. Theory Relat. Fields 162, 47–70 (2015)
Chernozhukov, V., Chetverikov, D., Kato, K.: Empirical and multiplier bootstraps for suprema of empirical processes of increasing complexity, and related Gaussian couplings. Stoch. Process. Appl. 126, 3632–3651 (2016)
Chernozhukov, V., Chetverikov, D., Kato, K.: Central limit theorems and bootstrap in high dimensions. Ann. Probab. 45, 2309–2353 (2017)
Chernozhukov, V., Chetverikov, D., Kato, K.: Detailed proof of Nazarov’s inequality. Unpublished paper. arXiv:1711.10696 (2017)
Courtade, T.A., Fathi, M., Pananjady, A.: Existence of Stein kernels under a spectral gap, and discrepancy bounds. Ann. Inst. Henri Poincaré Probab. Stat. 55, 777–790 (2019)
de Jong, P.: A central limit theorem for generalized quadratic forms. Probab. Theory Relat. Fields 75, 261–277 (1987)
de Jong, P.: A central limit theorem for generalized multilinear forms. J. Multivariate Anal. 34, 275–289 (1990)
de la Peña, V.H., Montgomery-Smith, S.J.: Decoupling inequalities for the tail probabilities of multivariate \(U\)-statistics. Ann. Probab. 23, 806–816 (1995)
Deng, H., Zhang, C.-H.: Beyond Gaussian approximation: Bootstrap for maxima of sums of independent random vectors. Ann. Stat. 48, 3643–3671 (2020)
Dette, H., Hetzler, B.: Specification tests indexed by bandwidths. Sankhyā Indian J. Stat. 69, 28–54 (2007)
Dirksen, S.: Tail bounds via generic chaining. Electron. J. Probab. 20, 1–29 (2015)
Döbler, C., Krokowski, K.: On the fourth moment condition for Rademacher chaos. Ann. Inst. Henri Poincaré Probab. Stat. 55, 61–97 (2019)
Döbler, C., Peccati, G.: Quantitative de Jong theorems in any dimension. Electron. J. Probab. 22, 1–35 (2017)
Döbler, C., Vidotto, A., Zheng, G.: Fourth moment theorems on the Poisson space in any dimension. Electron. J. Probab. 23, 1–27 (2018)
Götze, F., Tikhomirov, A.N.: Asymptotic distribution of quadratic forms. Ann. Probab. 27, 1072–1098 (1999)
Hitczenko, P., Montgomery-Smith, S., Oleszkiewicz, K.: Moment inequalities for sums of certain independent symmetric random variables. Studia Math. 123, 15–42 (1997)
Horowitz, J.L., Spokoiny, V.G.: An adaptive, rate-optimal test of a parametric mean-regression model against a nonparametric alternative. Econometrica 69, 599–631 (2001)
Janson, S.: Gaussian Hilbert Spaces. Cambridge University Press, Cambridge (1997)
Koike, Y.: Mixed-normal limit theorems for multiple Skorohod integrals in high-dimensions, with application to realized covariance. Electron. J. Stat. 13, 1443–1522 (2019)
Koike, Y.: Gaussian approximation of maxima of Wiener functionals and its application to high-frequency data. Ann. Stat. 47, 1663–1687 (2019)
Kuchibhotla, A.K., Chakrabortty, A.: Moving beyond sub-Gaussianity in high-dimensional statistics: applications in covariance estimation and linear regression. Working paper. arXiv:1804.02605 (2018)
Kwapień, S., Woyczyński, W.A.: Random Series and Stochastic Integrals: Single and Multiple. Birkhäuser, Basel (1992)
Ledoux, M.: Chaos of a Markov operator and the fourth moment condition. Ann. Probab. 40, 2439–2459 (2012)
Ledoux, M., Nourdin, I., Peccati, G.: Stein’s method, logarithmic Sobolev and transport inequalities. Geom. Funct. Anal. 25, 256–306 (2015)
Liu, M., Shang, Z., Cheng, G.: Nonparametric testing under random projection. Working paper. arXiv:1802.06308 (2018)
Mossel, E., O’Donnell, R., Oleszkiewicz, K.: Noise stability of functions with low influences: invariance and optimality. Ann. Math. 171, 295–341 (2010)
Nourdin, I., Peccati, G.: Stein’s method and exact Berry–Esseen asymptotics for functionals of Gaussian fields. Ann. Probab. 37, 2231–2261 (2009)
Nourdin, I., Peccati, G.: Stein’s method on Wiener chaos. Probab. Theory Relat. Fields 145, 75–118 (2009)
Nourdin, I., Peccati, G.: Normal Approximations with Malliavin Calculus: From Stein’s Method to Universality. Cambridge University Press, Cambridge (2012)
Nourdin, I., Peccati, G., Poly, G., Simone, R.: Classical and free fourth moment theorems: Universality and thresholds. J. Theoret. Probab. 29, 653–680 (2016)
Nourdin, I., Peccati, G., Poly, G., Simone, R.: Multidimensional limit theorems for homogeneous sums: a survey and a general transfer principle. ESAIM Probab. Stat. 20, 293–308 (2016)
Nourdin, I., Peccati, G., Reinert, G.: Invariance principles for homogeneous sums: Universality of Gaussian Wiener chaos. Ann. Probab. 38, 1947–1985 (2010)
Nourdin, I., Peccati, G., Reinert, G.: Stein’s method and stochastic analysis of Rademacher functionals. Electron. J. Probab. 15, 1703–1742 (2010)
Nualart, D., Peccati, G.: Central limit theorems for sequences of multiple stochastic integrals. Ann. Probab. 33, 177–193 (2005)
Paulauskas, V.: A note on the rate of convergence in the CLT for empirical processes. Lith. Math. J. 32, 312–316 (1992)
Peccati, G., Tudor, C.A.: Gaussian limits for vector-valued multiple stochastic integrals. In: Émery, M., Ledoux, M., Yor, M. (eds.) Séminaire de Probabilités XXXVIII, vol. 1857 of Lecture Notes in Math., pp. 247–262. Springer, Berlin (2005)
Peccati, G., Zheng, C.: Universal Gaussian fluctuations on the discrete Poisson chaos. Bernoulli 20, 697–715 (2014)
Rotar’, V.I.: Limit theorems for multilinear forms and quasipolynomial functions. Theory Probab. Appl. 20, 512–532 (1975)
Rotar’, V.I.: Limit theorems for polylinear forms. J. Multivariate Anal. 9, 511–530 (1979)
Rudin, W.: Real and Complex Analysis, 3rd edn. McGraw-Hill, New York (1987)
Song, Y., Chen, X., Kato, K.: Approximating high-dimensional infinite-order \(U\)-statistics: statistical and computational guarantees. Electron. J. Stat. 13, 4794–4848 (2019)
van der Vaart, A.W., Wellner, J.A.: Weak Convergence and Empirical Processes. Springer, Berlin (1996)
Zhai, A.: A high-dimensional CLT in \({\cal{W}}_2\) distance with near optimal convergence rate. Probab. Theory Relat. Fields 170, 821–845 (2018)
Zheng, G.: A Peccati–Tudor type theorem for Rademacher chaoses. ESAIM Probab. Stat. 23, 874–892 (2019)
Acknowledgements
The author is grateful to an anonymous referee for his or her constructive comments. Thanks are also due to the participants at the Osaka Probability Seminar on November 28, 2017, for insightful comments. The author also thanks Professor Giovanni Peccati for pointing out that the same type of bound as in Corollary 3.1 has already appeared in [4, Theorem 3.1].
Open Access
This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Funding
This study was funded by JST CREST (Grant Number JPMJCR14D7) and JSPS KAKENHI (Grant Numbers JP16K17105, JP17H01100, JP18H00836).
Ethics declarations
Conflict of interest
The author declares that he has no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix A: Properties of the \(\psi _\alpha \)-norm
Here we collect several properties of the \(\psi _\alpha \)-norm used in this paper. Let \(\alpha \) be a positive number. Recall that the \(\psi _\alpha \)-norm of a random variable X is defined by
where \(\psi _\alpha (x):=\exp (x^\alpha )-1\). From the definition, we can easily deduce the following useful identity:
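In the standard formulation (stated here for definiteness under the usual conventions; cf. the reference to the infimum in (A.1) below), the norm and the identity read

```latex
\Vert X\Vert _{\psi _\alpha }:=\inf \left\{ C>0:\mathrm {E}\left[ \psi _\alpha \left( |X|/C\right) \right] \le 1\right\} ,
\qquad
\Vert X\Vert _{\psi _\alpha }=\big \Vert \,|X|^\alpha \,\big \Vert _{\psi _1}^{1/\alpha },
```

the latter following immediately from \(\psi _1(x^\alpha )=\psi _\alpha (x)\).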
Using this relation, we can derive properties of the \(\psi _\alpha \)-norm from those of the \(\psi _1\)-norm. This is convenient because the latter are well studied in the literature. For example, since \(\Vert \cdot \Vert _{\psi _1}\) satisfies the triangle inequality, we have
for any random variables X, Y. Also, using Young’s inequality for products and Hölder’s inequality, one can prove \(\Vert X\Vert _{\psi _1}\le (\log 2)^{1/p-1}\Vert X\Vert _{\psi _p}\) for any random variable X and \(p>1\). So we obtain \( \Vert X\Vert _{\psi _\alpha }\le (\log 2)^{1/\beta -1/\alpha }\Vert X\Vert _{\psi _\beta } \) for any \(0<\alpha \le \beta <\infty \). Other useful results can be obtained from [57, Lemmas 2.2.1–2.2.2]:
Lemma A.1
Suppose that there are constants \(C,K>0\) such that \(P(|X|>x)\le Ke^{-Cx^\alpha }\) for all \(x>0\). Then, we have \(\Vert X\Vert _{\psi _\alpha }\le \left( (1+K)/C\right) ^{1/\alpha }\).
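As a quick numerical sanity check (an illustrative example of ours, not from the paper): for \(X\sim \mathrm{Exp}(1)\) we have \(P(|X|>x)=e^{-x}\), i.e. \(\alpha =1\), \(C=1\), \(K=1\), so Lemma A.1 gives \(\Vert X\Vert _{\psi _1}\le 2\). The defining condition \(\mathrm {E}[\psi _1(|X|/2)]\le 1\) can be verified by quadrature, since \(\mathrm {E}[e^{X/2}]=\int _0^\infty e^{-x/2}\,dx=2\).

```python
import math

# Sanity check of Lemma A.1 for X ~ Exp(1) (illustrative example):
# P(|X| > x) = e^{-x}, i.e. alpha = 1, C = 1, K = 1, so the lemma's
# bound is ((1 + K)/C)^{1/alpha} = 2.  We verify E[psi_1(|X|/2)] <= 1,
# where psi_1(x) = e^x - 1 and E[e^{X/2}] = int_0^inf e^{x/2} e^{-x} dx.

def mgf_at_half(upper=100.0, n=100_000):
    """Trapezoidal estimate of E[e^{X/2}] for X ~ Exp(1), integrand e^{-x/2}."""
    h = upper / n
    total = 0.5 * (1.0 + math.exp(-upper / 2))   # endpoint contributions
    total += sum(math.exp(-k * h / 2) for k in range(1, n))
    return total * h

bound = ((1 + 1) / 1) ** (1 / 1)   # Lemma A.1's bound: 2.0
lhs = mgf_at_half() - 1.0          # E[psi_1(|X|/bound)], approximately 1
print(bound, round(lhs, 4))        # prints: 2.0 1.0
```

Here the bound is in fact attained up to the factor 2, since \(\mathrm {E}[\psi _1(|X|/2)]=1\) exactly for this distribution.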
Lemma A.2
There is a universal constant \(K>0\) such that
for any \(\alpha >0\) and random variables \(X_1,\dots ,X_d\).
It is also easy to check that \(\Vert X\Vert _{\psi _\alpha }\) attains the infimum in (A.1) if \(\Vert X\Vert _{\psi _\alpha }<\infty \). That is, \(\mathrm {E}[\psi _\alpha (|X|/\Vert X\Vert _{\psi _\alpha })]\le 1\). Therefore, the Markov inequality yields the following converse of Lemma A.1:
Lemma A.3
If \(\Vert X\Vert _{\psi _\alpha }<\infty \), we have \(P(|X|\ge x)\le 2e^{-(x/\Vert X\Vert _{\psi _\alpha })^\alpha }\) for every \(x>0\).
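The Markov-inequality step behind Lemma A.3 can be made explicit (a standard computation; write \(t:=(x/\Vert X\Vert _{\psi _\alpha })^\alpha \)):

```latex
P(|X|\ge x)
\le \frac{\mathrm {E}\left[ \psi _\alpha \left( |X|/\Vert X\Vert _{\psi _\alpha }\right) \right] }{\psi _\alpha \left( x/\Vert X\Vert _{\psi _\alpha }\right) }
\le \frac{1}{e^{t}-1}\le 2e^{-t}\qquad \text {if }e^{t}\ge 2,
```

while for \(e^{t}<2\) the right-hand side \(2e^{-t}\) already exceeds 1, so the bound holds trivially.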
Next, we investigate the relation between the \(\psi _\alpha \)-norm and moment growth. First, [26, Lemma A.1] yields the following result:
Lemma A.4
If there is a constant \(A>0\) such that \(\Vert X\Vert _p\le Ap^{1/\alpha }\) for all \(p\ge 1\), then \( P(|X|\ge x)\le e^{1/\alpha }e^{-(\alpha e)^{-1}(x/A)^\alpha } \) for every \(x>0\).
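The underlying argument is the usual optimization of Markov's inequality over moments; a sketch under the lemma's assumption:

```latex
P(|X|\ge x)\le \left( \frac{\Vert X\Vert _p}{x}\right) ^p\le \left( \frac{Ap^{1/\alpha }}{x}\right) ^p\qquad (p\ge 1).
```

Choosing \(p=e^{-1}(x/A)^\alpha \) (admissible once \(x\ge Ae^{1/\alpha }\)) makes the base equal to \(e^{-1/\alpha }\) and yields the exponent \(-(\alpha e)^{-1}(x/A)^\alpha \); for \(x<Ae^{1/\alpha }\) the stated bound exceeds 1 thanks to the prefactor \(e^{1/\alpha }\), so it holds trivially.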
Combining Lemma A.4 with Lemma A.1, we obtain the following result:
Lemma A.5
Suppose that there is a constant \(A>0\) such that \(\Vert X\Vert _p\le Ap^{1/\alpha }\) for all \(p\ge 1\). Then, we have \(\Vert X\Vert _{\psi _\alpha }\le \left( (1+e^{1/\alpha })\alpha e\right) ^{1/\alpha }A\).
Lemma A.3 and [26, Lemma A.2] yield the following converse of Lemma A.5:
Lemma A.6
For all \(p\ge 1\), it holds that \(\Vert X\Vert _p\le c_\alpha \Vert X\Vert _{\psi _\alpha }p^{1/\alpha }\) with \( c_\alpha :=e^{1/2e-1/\alpha }\alpha ^{-1/\alpha }\max \left\{ 1,2\sqrt{\frac{2\pi }{\alpha }}e^{\alpha /12}\right\} . \)
Finally, we have the following Hölder-type inequality for the \(\psi _\alpha \)-norm:
Lemma A.7
([36], Proposition S.3.2) Let \(X_1,X_2\) be two random variables such that \(\Vert X_1\Vert _{\psi _{\alpha _1}}+\Vert X_2\Vert _{\psi _{\alpha _2}}<\infty \) for some \(\alpha _1,\alpha _2>0\). Then, we have \( \Vert X_1X_2\Vert _{\psi _\alpha }\le \Vert X_1\Vert _{\psi _{\alpha _1}}\Vert X_2\Vert _{\psi _{\alpha _2}}, \) where \(\alpha >0\) is defined by the equation \(1/\alpha =1/\alpha _1+1/\alpha _2\).
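As a concrete illustration of Lemma A.7 (our own example, using standard Gaussian closed forms rather than anything from the source): take \(X_1=X_2=|Z|\) with \(Z\sim N(0,1)\) and \(\alpha _1=\alpha _2=2\), so \(\alpha =1\). Then \(X_1X_2=Z^2\) and the inequality holds with equality, since \(\Vert Z\Vert _{\psi _2}^2=8/3=\Vert Z^2\Vert _{\psi _1}\). A numerical sketch solving the two defining equations by bisection:

```python
import math

# Lemma A.7 with X1 = X2 = |Z|, Z ~ N(0,1), alpha1 = alpha2 = 2, hence alpha = 1.
# For c^2 > 2, E[exp(Z^2/c^2)] = (1 - 2/c^2)^(-1/2); the psi_2-norm of Z is the c
# at which this equals 2 (i.e. E[psi_2(|Z|/c)] = 1).
def psi2_norm_gaussian():
    lo, hi = math.sqrt(2) + 1e-9, 10.0
    for _ in range(200):                      # bisection on the defining equation
        mid = (lo + hi) / 2
        if (1 - 2 / mid ** 2) ** -0.5 > 2:    # expectation too large -> c too small
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Likewise, E[exp(Z^2/C)] = (1 - 2/C)^(-1/2) gives the psi_1-norm of Z^2.
def psi1_norm_gaussian_square():
    lo, hi = 2 + 1e-9, 20.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if (1 - 2 / mid) ** -0.5 > 2:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

n2 = psi2_norm_gaussian()          # ~ sqrt(8/3)
n1 = psi1_norm_gaussian_square()   # ~ 8/3
print(n1, n2 * n2)                 # both ~ 8/3: equality in Lemma A.7 here
```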
Appendix B: Proof of Lemma 6.2
Lemma B.1
(Strong domination) Let \((\xi _i)_{i=1}^N\) and \((\theta _i)_{i=1}^N\) be two sequences of independent symmetric random variables. Suppose that there is an integer \(k>0\) such that \(P(|\xi _i|>t)\le k P(|\theta _i|>t)\) for all \(i\in [N]\) and \(t>0\). Then, for any \(p\ge 1\) and \(a_1,\dots ,a_N\in {\mathbb {R}}\) we have
Proof
This is a consequence of Theorem 3.2.1 and Corollary 3.2.1 in [37]. \(\square \)
Lemma B.2
Let \((\xi _i)_{i\in {\mathbb {N}}}\) be a sequence of independent copies of a symmetric random variable \(\xi \) satisfying \(P(|\xi |\ge t)=e^{-|t|^\alpha }\) for every \(t\ge 0\) and some \(0<\alpha \le 2\). Then, there is a constant \(C_\alpha >0\) which depends only on \(\alpha \) such that
for any \(p\ge 2\), \(N\in {\mathbb {N}}\) and \(a_1,\dots ,a_N\in {\mathbb {R}}\).
Proof
Note that, by Lemmas A.1 and A.6, there is a constant \(K_\alpha >0\) depending only on \(\alpha \) such that \(\Vert \xi \Vert _r\le K_\alpha r^{1/\alpha }\) for all \(r\ge 1\). Then, the result follows from [31, Theorem 1.1] when \(\alpha \le 1\) and from the Gluskin–Kwapień inequality (cf. page 17 of [31]) when \(\alpha >1\). \(\square \)
Lemma B.3
Let \((\zeta _i)_{i=1}^N\) be a sequence of independent centered random variables such that \(M:=\max _{1\le i\le N}\Vert \zeta _i\Vert _{\psi _\alpha }<\infty \) for some \(0<\alpha \le 2\). Then, we have
for any \(p\ge 1\) and \(a_1,\dots ,a_N\in {\mathbb {R}}\), where \(K_\alpha >0\) depends only on \(\alpha \).
Proof
Thanks to symmetrization inequalities (see, e.g., [57, Lemma 2.3.1]), it suffices to consider the case where \(\zeta _i\) is symmetric for all i. Then, the result follows from Lemmas A.3, B.1 and B.2. \(\square \)
Lemma B.4
Let \((X_{i,j})_{1\le i\le N,1\le j\le q}\) be an array of independent centered random variables. Assume \(M:=\max _{1\le i\le N,1\le j\le q}\Vert X_{i,j}\Vert _{\psi _\alpha }<\infty \) for some \(0<\alpha \le 2\). Then,
for any \(p\ge 2\) and \(f:[N]^q\rightarrow {\mathbb {R}}\), where \(K_{\alpha }>0\) is the same as in Lemma B.3.
Proof
This can easily be shown by induction on q and using Lemma B.3. \(\square \)
Proof of Lemma 6.2
The claim is an immediate consequence of [23, Theorem 1], [55, Theorem 8.16] and Lemma B.4. \(\square \)
Koike, Y. High-Dimensional Central Limit Theorems for Homogeneous Sums. J Theor Probab 36, 1–45 (2023). https://doi.org/10.1007/s10959-022-01156-2
Keywords
- de Jong’s theorem
- Fourth-moment theorem
- High dimensions
- Peccati–Tudor-type theorem
- Quantitative CLT
- Randomized Lindeberg method
- Stein kernel
- Universality