Abstract
This paper develops a quantitative version of de Jong’s central limit theorem for homogeneous sums in a high-dimensional setting. More precisely, under appropriate moment assumptions, we establish an upper bound for the Kolmogorov distance between a multi-dimensional vector of homogeneous sums and a Gaussian vector such that the bound depends polynomially on the logarithm of the dimension and is governed by the fourth cumulants and the maximal influences of the components. As a corollary, we obtain high-dimensional versions of fourth-moment theorems, universality results and Peccati–Tudor-type theorems for homogeneous sums. We also sharpen some existing (quantitative) central limit theorems via applications of our result.
1 Introduction
Let \(\varvec{X}=(X_i)_{i=1}^\infty \) be a sequence of independent centered random variables with unit variance. A homogeneous sum is a random variable of the form
$$\begin{aligned} Q(f;\varvec{X}):=\sum _{i_1,\dots ,i_q=1}^{N}f(i_1,\dots ,i_q)X_{i_1}\cdots X_{i_q}, \end{aligned}$$
where \(N,q\in {\mathbb {N}}\), \([N]:=\{1,\dots ,N\}\) and \(f:[N]^q\rightarrow {\mathbb {R}}\) is a symmetric function vanishing on diagonals, i.e., \(f(i_1,\dots ,i_q)=0\) unless \(i_1,\dots ,i_q\) are mutually different. Limit theorems for sequences of homogeneous sums have a long history in probability theory. Rotar’ [53, 54] investigated invariance principles for \(Q(f;\varvec{X})\) with respect to the law of \(\varvec{X}\). In the notable work of de Jong [22], the following striking result was established: For every \(n\in {\mathbb {N}}\), let \(f_n:[N_n]^q\rightarrow {\mathbb {R}}\) be a symmetric function vanishing on diagonals with q fixed and \(N_n\uparrow \infty \) as \(n\rightarrow \infty \). Assume \(\mathrm {E}[X_i^4]<\infty \) for all i and \(\mathrm {E}[Q(f_n;\varvec{X})^2]=1\) for all n. Then, \(Q(f_n;\varvec{X})\) converges in law to the standard normal distribution, provided that the following two conditions hold:
(i) \(\mathrm {E}[Q(f_n;\varvec{X})^4]\rightarrow 3\) as \(n\rightarrow \infty \).
(ii) \(\max _{1\le i\le N_n}{{\,\mathrm{Inf}\,}}_i(f_n)\rightarrow 0\) as \(n\rightarrow \infty \), where \({{\,\mathrm{Inf}\,}}_i(f_n)\) is defined by
$$\begin{aligned} {{\,\mathrm{Inf}\,}}_i(f_n):=\sum _{i_2,\dots ,i_q=1}^{N_n}f_n(i,i_2,\dots ,i_q)^2 \end{aligned}$$(1.1)
and called the influence of the ith variable of \(f_n\).
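For illustration, the objects just defined are straightforward to compute numerically. The following Python sketch (our own illustrative code with a toy random kernel, not part of the paper; numpy assumed) evaluates \(Q(f;\varvec{X})\) and the influences \({{\,\mathrm{Inf}\,}}_i(f)\) for \(q=2\), using the standard identity \(\mathrm {E}[Q(f;\varvec{X})^2]=2\Vert f\Vert _{\ell _2}^2\) for \(q=2\) to normalize f so that \(\mathrm {E}[Q(f;\varvec{X})^2]=1\).

```python
import numpy as np

rng = np.random.default_rng(0)

def make_kernel(N):
    """A toy symmetric kernel f: [N]^2 -> R vanishing on diagonals."""
    f = rng.standard_normal((N, N))
    f = (f + f.T) / 2            # symmetrize
    np.fill_diagonal(f, 0.0)     # vanish on diagonals
    return f

def homogeneous_sum(f, X):
    """Q(f; X) = sum over i1, i2 of f(i1, i2) * X_{i1} * X_{i2} (case q = 2)."""
    return X @ f @ X

def influences(f):
    """Inf_i(f) = sum over i2 of f(i, i2)^2, as in (1.1)."""
    return np.sum(f**2, axis=1)

N = 100
f = make_kernel(N)
# For q = 2 and independent unit-variance X, E[Q(f;X)^2] = 2 ||f||^2,
# so this rescaling enforces E[Q(f;X)^2] = 1.
f /= np.sqrt(2.0) * np.linalg.norm(f)

X = rng.standard_normal(N)    # one draw of independent standardized variables
Q = homogeneous_sum(f, X)
max_influence = influences(f).max()
```

Condition (ii) of de Jong’s theorem asks precisely that `max_influence` vanish asymptotically along the sequence \(f_n\).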
When \(q=1\), condition (ii) says that \(\max _{1\le i\le N_n}f_n(i)^2\rightarrow 0\) as \(n\rightarrow \infty \), which is equivalent to the celebrated Lindeberg condition. In this case condition (i) is automatically implied by (ii), and thus it is superfluous. In contrast, when \(q\ge 2\), condition (ii) is no longer sufficient for the asymptotic normality of the sequence \((Q(f_n;\varvec{X}))_{n=1}^\infty \), so an additional condition is needed. The motivation for introducing condition (i) in [22] was the easily verified fact that condition (i) is equivalent to the asymptotic normality of \((Q(f_n;\varvec{X}))_{n=1}^\infty \) when \(q=2\) and \(\varvec{X}\) is Gaussian (see also [21]). Later on, this observation was significantly improved in the influential paper by Nualart & Peccati [49]: For any q, the asymptotic normality of \((Q(f_n;\varvec{X}))_{n=1}^\infty \) is implied by condition (i) alone as long as \(\varvec{X}\) is Gaussian. Results of this type are nowadays called fourth-moment theorems and have been extensively studied in the past decade. In particular, further investigation of the fourth-moment theorem of [49] has led to the introduction of the so-called Malliavin–Stein method by Nourdin & Peccati [43], which has produced one of the most active research areas in the recent probabilistic literature. We refer the reader to the monograph [44] for an introduction to this subject and to the survey [3] for recent developments.
The implications of the Malliavin–Stein method for de Jong’s central limit theorem (CLT) for homogeneous sums were investigated in the seminal work of Nourdin, Peccati & Reinert [47], where several important extensions of de Jong’s result were developed. The following three results are particularly relevant to our work:
(I) First, they established a multi-dimensional extension of de Jong’s CLT which shows that multi-dimensional vectors of homogeneous sums enjoy a CLT if de Jong’s criterion is satisfied component-wise. More precisely, let \(d\in {\mathbb {N}}\) and, for every \(j=1,\dots ,d\), let \(q_j\in {\mathbb {N}}\) and \(f_{n,j}:[N_n]^{q_j}\rightarrow {\mathbb {R}}\) be a symmetric function vanishing on diagonals. Also, let \(\mathfrak {C}=(\mathfrak {C}_{jk})_{1\le j,k\le d}\) be a \(d\times d\) positive semidefinite symmetric matrix and suppose that
$$\begin{aligned} \max _{1\le j,k\le d}|\mathrm {E}[Q(f_{n,j};\varvec{X})Q(f_{n,k};\varvec{X})]-\mathfrak {C}_{jk}|\rightarrow 0 \end{aligned}$$as \(n\rightarrow \infty \). Then, the d-dimensional random vector
$$\begin{aligned} \varvec{Q}^{(n)}(\varvec{X}):=(Q(f_{n,1};\varvec{X}),\dots ,Q(f_{n,d};\varvec{X})) \end{aligned}$$converges in law to the d-dimensional normal distribution \({\mathcal {N}}_d(0,\mathfrak {C})\) with mean 0 and covariance matrix \(\mathfrak {C}\) as \(n\rightarrow \infty \) if \(\mathrm {E}[Q(f_{n,j};\varvec{X})^4]-3\mathrm {E}[Q(f_{n,j};\varvec{X})^2]^2\rightarrow 0\) and \(\max _{1\le i\le N_n}{{\,\mathrm{Inf}\,}}_i(f_{n,j})\rightarrow 0\) as \(n\rightarrow \infty \) for every \(j=1,\dots ,d\).
(II) Second, they found the following universality of Gaussian variables in the context of homogeneous sums ([47, Theorem 1.2]): Assume
$$\begin{aligned} \sup _n\sum _{i_1,\dots ,i_{q_j}=1}^{N_n}f_{n,j}(i_1,\dots ,i_{q_j})^2<\infty \end{aligned}$$
and \(\mathfrak {C}_{jj}>0\) for every j. If \(\varvec{Q}^{(n)}(\varvec{G})\) converges in law to \({\mathcal {N}}_d(0,\mathfrak {C})\) as \(n\rightarrow \infty \) for a sequence of independent standard Gaussian variables \(\varvec{G}=(G_i)_{i=1}^\infty \), then \(\varvec{Q}^{(n)}(\varvec{X})\) converges in law to \({\mathcal {N}}_d(0,\mathfrak {C})\) as \(n\rightarrow \infty \) for any sequence \(\varvec{X}=(X_i)_{i=1}^\infty \) of independent centered random variables with unit variance and such that \(\sup _i\mathrm {E}[|X_i|^3]<\infty \).
(III) Third, they established some quantitative versions of de Jong’s CLT for homogeneous sums; see Proposition 5.4 and Corollary 7.3 in [47] for details (see also Sect. 2.1.1).
We remark that these results have been generalized in various directions by subsequent studies. For example, the universality results analogous to (II) have also been established for Poisson variables in Peccati & Zheng [52] and i.i.d. variables with zero skewness and nonnegative excess kurtosis in Nourdin et al. [45, 46], respectively. Also, the recent work of Döbler & Peccati [28] has extended (I) and (II) to more general degenerate U-statistics which were originally treated in [22].
As the title of the paper suggests, the aim of this paper is to extend the above results to a high-dimensional setting where the dimension d depends on n and \(d=d_n\rightarrow \infty \) as \(n\rightarrow \infty \). Of course, in such a setting, the “asymptotic distribution” \(\mathcal {N}_d(0,\mathfrak {C})\) also depends on n and, even worse, it is typically no longer tight. Therefore, we need to properly reformulate the above statements in this setting. In this paper, we adopt the so-called metric approach to accomplish this purpose: We try to establish the convergence of some metric between the laws of \(\varvec{Q}^{(n)}(\varvec{X})\) and \(\mathcal {N}_d(0,\mathfrak {C})\). Specifically, we take the Kolmogorov distance as the metric between the probability laws. Namely, letting \(Z^{(n)}\) be a \(d_n\)-dimensional centered Gaussian vector with covariance matrix \(\mathfrak {C}_n\) for each n, we aim at proving the following convergence:
$$\begin{aligned} \sup _{x\in {\mathbb {R}}^{d_n}}|P(\varvec{Q}^{(n)}(\varvec{X})\le x)-P(Z^{(n)}\le x)|\rightarrow 0\quad \text {as }n\rightarrow \infty . \end{aligned}$$
Here, for vectors \(x=(x_1,\dots ,x_{d_n})\in {\mathbb {R}}^{d_n}\) and \(y=(y_1,\dots ,y_{d_n})\in {\mathbb {R}}^{d_n}\), we write \(x\le y\) to express \(x_j\le y_j\) for every \(j=1,\dots ,d_n\). In addition, we are particularly interested in the situation where the dimension \(d=d_n\) increases much faster than the inverse of the “standard” convergence rate of Gaussian approximation for a sequence of univariate homogeneous sums. Given that both \(\sqrt{|\mathrm {E}[Q(f_n;\varvec{X})^4]-3\mathrm {E}[Q(f_{n};\varvec{X})^2]^2|}\) and \(\max _{1\le i\le N_n}\sqrt{{{\,\mathrm{Inf}\,}}_i(f_n)}\) can be the optimal convergence rates of the Gaussian approximation of \(Q(f_{n};\varvec{X})\) in the Kolmogorov distance (see [42, Proposition 3.8] for the former and [30, Remark 1] for the latter), we might consider the quantity
$$\begin{aligned} \delta _n:=\max _{1\le j\le d_n}\sqrt{|\mathrm {E}[Q(f_{n,j};\varvec{X})^4]-3\mathrm {E}[Q(f_{n,j};\varvec{X})^2]^2|+\max _{1\le i\le N_n}{{\,\mathrm{Inf}\,}}_i(f_{n,j})} \end{aligned}$$
as an appropriate definition of the “standard” convergence rate. Then, we aim at proving
$$\begin{aligned} \sup _{x\in {\mathbb {R}}^{d_n}}|P(\varvec{Q}^{(n)}(\varvec{X})\le x)-P(Z^{(n)}\le x)|\le C(\log d_n)^a\delta _n^b \end{aligned}$$(1.2)
for all \(n\in {\mathbb {N}}\), where \(a,b,C>0\) are constants which do not depend on n (here and below we assume \(d_n\ge 2\)). As a byproduct, results of this type enable us to extend fourth-moment theorems and universality results for homogeneous sums to a high-dimensional setting (see Theorem 2.2 for the precise statement).
Our formulation of a high-dimensional extension of CLTs for homogeneous sums is motivated by the recent path-breaking work of Chernozhukov, Chetverikov & Kato [13, 18], where results analogous to (1.2) have been established for sums of independent random vectors. More formally, let \((\xi _{n,i})_{i=1}^n\) be a sequence of independent centered \(d_n\)-dimensional random vectors. Set \(S_n:=n^{-1/2}\sum _{i=1}^n\xi _{n,i}\) and assume \(\mathfrak {C}_n=\mathrm {E}[S_nS_n^\top ]\) (\(\top \) denotes the transpose of a matrix). Then, under an appropriate assumption on moments, we have
where \(C'>0\) is a constant which does not depend on n (see Proposition 2.1 in [18] for the precise statement). Here, we shall remark that the bound in (1.3) depends on n through \(n^{-1/6}\), which is suboptimal when the dimension \(d_n\) is fixed. However, in [18, Remark 2.1(ii)] it is conjectured that the rate \(n^{-1/6}\) is nearly optimal in a minimax sense when \(d_n\) is much larger than n (see also [10, Remark 1]). This conjecture is motivated by the fact that the rate \(n^{-1/6}\) is minimax optimal in CLTs for sums of independent random variables taking values in an infinite-dimensional Banach space (see, e.g., [8, Theorem 2.6]). Given that high-dimensional CLTs of type (1.3) are closely related to Gaussian approximation of the suprema of empirical processes (see, e.g., [15, 17]), it is worth mentioning that a duality argument enables us to translate the minimax rate for CLTs in a Banach space to the one for Gaussian approximation of the suprema of empirical processes with a specific class of functions in the Kolmogorov distance; see [50] for details. For this reason, we also conjecture that \(b=1/3\) would give an optimal dependence on \(\delta _n\) of the bound in (1.2) (note that the rate \(n^{-1/2}\) is the standard convergence rate of CLTs for sums of independent one-dimensional random variables). In this paper, we indeed establish that a bound of type (1.2) holds true with \(b=1/3\) under a moment assumption on \(\varvec{X}\) when the \(q_j\)’s do not depend on j (see Theorem 2.1 and Remark 2.1).
We remark that there are a number of articles extending the scope of the Chernozhukov–Chetverikov–Kato theory (CCK theory for short) in various directions. We refer the reader to the survey [6] for recent developments. Nevertheless, most studies focus on linear statistics (i.e., sums of random variables), and only a few articles are concerned with nonlinear statistics. Two exceptions are U-statistics, developed in [10,11,12, 56], and Wiener functionals, developed in [34, 35]. On the one hand, however, the former are mainly concerned with non-degenerate U-statistics, which are approximately linear statistics via the Hoeffding decomposition (Chen & Kato [11] also handle degenerate U-statistics, but they focus on randomized incomplete versions that are still approximately linear statistics). On the other hand, although the latter deal with essentially nonlinear statistics, they must be functionals of a (possibly infinite-dimensional) Gaussian process, except for [35, Theorem 3.2], which is a version of our result with \(q_j\equiv 2\) (see Sect. 2.1.2 for more details). In this sense, our result would be the first extension of CCK-type results to essentially nonlinear statistics based on possibly non-Gaussian variables.
Finally, we mention that the main results of this paper have potential applications to statistics. In fact, the original motivation of this paper was to improve the Gaussian approximation result for maxima of high-dimensional vectors of random quadratic forms given by [35, Theorem 3.2], which is used to ensure the validity of the bootstrap testing procedure proposed in [35, Section 4.1] (see Sect. 2.2). Another potential application might be specification tests for parametric forms in nonparametric regression. In this area, to derive the null distributions of test statistics, one sometimes needs to approximate the maximum of (essentially degenerate) quadratic forms; see [25, 32, 40] for instance.
This paper is organized as follows. Section 2 presents the main results obtained in the paper, while Sects. 3–7 are devoted to the proof of the main results: Sect. 3 demonstrates a basic scheme of the CCK theory to prove high-dimensional CLTs. Subsequently, Sect. 4 presents a connection of this scheme to Stein’s method. Based on this observation, Sect. 5 develops a high-dimensional CLT of the form (1.2) for homogeneous sums based on normal and gamma variables. Then, Sect. 6 establishes a kind of invariance principle for high-dimensional homogeneous sums using a randomized version of the Lindeberg method. Finally, Sect. 7 completes the proof of the main results.
1.1 Notation
\({\mathbb {Z}}_+\) denotes the set of all nonnegative integers. For \(x=(x_1,\dots ,x_d)\in {\mathbb {R}}^d\), we define \(\Vert x\Vert _{\ell _\infty }:=\max _{1\le j\le d}|x_j|\). For \(N\in {\mathbb {N}}\), we set \([N]:=\{1,\dots ,N\}\). We set \(\sum _{i=p}^q\equiv 0\) if \(p>q\) by convention. For \(q\in {\mathbb {N}}\), we denote by \(\mathfrak {S}_q\) the set of all permutations of [q], i.e., the symmetric group of degree q. For a function \(f:[N]^q\rightarrow {\mathbb {R}}\), we set \({\mathcal {M}}(f):=\max _{1\le i\le N}{{\,\mathrm{Inf}\,}}_i(f)\) (recall that \({{\,\mathrm{Inf}\,}}_i(f)\) is defined according to (1.1)). We also set \( \Vert f\Vert _{\ell _2}:=\sqrt{\sum _{i_1,\dots ,i_q=1}^Nf(i_1,\dots ,i_q)^2}. \) For a function \(h:{\mathbb {R}}^d\rightarrow {\mathbb {R}}\), we set \(\Vert h\Vert _\infty :=\sup _{x\in {\mathbb {R}}^d}|h(x)|\). We write \(C^m_b({\mathbb {R}}^d)\) for the set of all real-valued \(C^m\) functions on \({\mathbb {R}}^d\) all of whose partial derivatives are bounded. We write \(\partial _{j_1\dots j_m}=\frac{\partial ^m}{\partial x_{j_1}\cdots \partial x_{j_m}}\) for short. Throughout the paper, \(Z=(Z_1,\dots ,Z_d)\) denotes a d-dimensional centered Gaussian random vector with covariance matrix \({\mathfrak {C}}=({\mathfrak {C}}_{ij})_{1\le i,j\le d}\) (note that we do not assume that \(\mathfrak {C}\) is positive definite in general). Also, \((q_j)_{j=1}^\infty \) stands for a sequence of positive integers. Throughout the paper, we will regard \((q_j)_{j=1}^\infty \) as fixed, i.e., it does not vary when we consider asymptotic results. Given a probability distribution \(\mu \), we write \(X\sim \mu \) to express that X is a random variable with distribution \(\mu \). For \(\nu >0\), we write \(\gamma (\nu )\) for the gamma distribution with shape \(\nu \) and rate 1. If S is a topological space, \(\mathcal {B}(S)\) denotes the Borel \(\sigma \)-field of S.
Given a random variable X, we set \(\Vert X\Vert _p:=\{\mathrm {E}[|X|^p]\}^{1/p}\) for every \(p>0\). When X satisfies \(\mathrm {E}[X^4]<\infty \), we denote the fourth cumulant of X by \(\kappa _4(X)\). Note that \(\kappa _4(X)=\mathrm {E}[X^4]-3\mathrm {E}[X^2]^2\) if X is centered. For \(\alpha >0\), we define the \(\psi _\alpha \)-norm of X by \( \Vert X\Vert _{\psi _\alpha }:=\inf \{C>0:\mathrm {E}[\psi _\alpha (|X|/C)]\le 1\}, \) where \(\psi _\alpha (x):=\exp (x^\alpha )-1\). Note that \(\Vert \cdot \Vert _{\psi _\alpha }\) is indeed a norm (on a suitable space) if and only if \(\alpha \ge 1\). Some useful properties of the \(\psi _\alpha \)-norm are collected in Appendix A.
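As a numerical illustration of the last two definitions (the code and function names below are ours, not from the paper), one can estimate the \(\psi _\alpha \)-norm of an empirical sample by bisection on C and compute \(\kappa _4\) directly. For a standard Gaussian variable, \(\Vert X\Vert _{\psi _2}=\sqrt{8/3}\approx 1.63\) and \(\kappa _4(X)=0\), which the sketch recovers approximately.

```python
import numpy as np

def psi_alpha_norm(sample, alpha, tol=1e-6):
    """Empirical psi_alpha-norm: the smallest C > 0 with
    mean(psi_alpha(|X|/C)) <= 1, where psi_alpha(x) = exp(x^alpha) - 1."""
    x = np.abs(np.asarray(sample, dtype=float))

    def excess(C):
        return np.mean(np.expm1((x / C) ** alpha)) - 1.0

    lo, hi = 0.0, 1.0
    while excess(hi) > 0:        # grow hi until the constraint is satisfied
        lo, hi = hi, 2.0 * hi
    while hi - lo > tol:         # excess(C) is decreasing in C, so bisect
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0:
            lo = mid
        else:
            hi = mid
    return hi

def kappa4(sample):
    """Fourth cumulant E[X^4] - 3 E[X^2]^2 of a centered sample."""
    x = np.asarray(sample, dtype=float)
    return np.mean(x**4) - 3.0 * np.mean(x**2) ** 2

rng = np.random.default_rng(1)
gauss = rng.standard_normal(100_000)
c2 = psi_alpha_norm(gauss, alpha=2.0)   # population value: sqrt(8/3) ~ 1.63
k4 = kappa4(gauss)                      # ~ 0 for Gaussian data
```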
2 Main Results
Our first main result is a high-dimensional version of de Jong’s CLT for homogeneous sums:
Theorem 2.1
Let \(\varvec{X}=(X_i)_{i=1}^N\) be a sequence of independent centered random variables with unit variance. Set \(w=\frac{1}{2}\) if \(\mathrm {E}[X_i^3]=0\) for every \(i\in [N]\) and \(w=1\) otherwise. For every \(j\in [d]\), let \(f_j:[N]^{q_j}\rightarrow {\mathbb {R}}\) be a symmetric function vanishing on diagonals, and set \(\varvec{Q}(\varvec{X}):=(Q(f_1;\varvec{X}),\dots ,Q(f_d;\varvec{X}))\). Suppose that \(d\ge 2\), \(\underline{\sigma }:=\min _{1\le j\le d}\Vert Z_j\Vert _2>0\) and \(\max _{1\le i\le N}\Vert X_i\Vert _{\psi _\alpha }<\infty \) for some \(\alpha \in (0,w^{-1}]\). Then,
where \(\overline{B}_N:=\max _{1\le i\le N}(\Vert X_i\Vert _{\psi _\alpha }\vee |\mathrm {E}[X_i^3]|)\), \(\overline{q}_d:=\max _{1\le j\le d}q_j\), \(\mu :=\max \{\frac{2}{3}w\overline{q}_d-\frac{1}{6},\frac{2(\overline{q}_d-1)}{3\alpha }+\frac{1}{3}\}\), \(C>0\) depends only on \(\alpha ,\overline{q}_d\) and
with \(\overline{A}_N:=\max _{1\le i\le N}(|\mathrm {E}[X_i^3]|\vee {\Vert X_i\Vert _4})\).
Remark 2.1
(a) Since \(\sum _{i=1}^N{{\,\mathrm{Inf}\,}}_i(f_k)^2\le \Vert f_k\Vert _{\ell _2}^2\mathcal {M}(f_k)\), Theorem 2.1 gives a bound of the form (1.2) under reasonable assumptions when \(q_1=\cdots =q_d\). For example, this is the case when \(\mathrm {E}[Q(f_j;\varvec{X})Q(f_k;\varvec{X})]={\mathfrak {C}}_{jk}\) for all \(j,k\in [d]\), \(\sup _i\Vert X_i\Vert _{\psi _\alpha }<\infty \) and \(\sup _j\Vert f_j\Vert _{\ell _2}<\infty \). Here, we keep \(\sum _{i=1}^N{{\,\mathrm{Inf}\,}}_i(f_k)^2\) rather than \(\Vert f_k\Vert _{\ell _2}^2\mathcal {M}(f_k)\) for convenience of application.
(b) When \(q_j<q_k\) for some \(j,k\in [d]\), the exponents of \(|\kappa _4(Q(f_k;\varvec{X}))|\) and \(\mathcal {M}(f_k)\) appearing in the bound of (2.1) are 1/12, which are halves of those for the case \(q_j=q_k\). This phenomenon is not specific to the high-dimensional setting but common in fourth-moment-type theorems. See Remark 1.9(a) in [29] for more details.
(c) In Sect. 2.1, we compare Theorem 2.1 to two existing results in some detail. The results therein show that the dependence of the bound in (2.1) on the dimension d is as sharp as (and sometimes sharper than) that of the previous results.
We can easily extend Theorem 2.1 to a high-dimensional CLT for homogeneous sums in hyperrectangles as follows. Let \({\mathcal {A}}^\mathrm {re}(d)\) be the set of all hyperrectangles in \({\mathbb {R}}^d\), i.e., \({\mathcal {A}}^\mathrm {re}(d)\) consists of all sets A of the form
$$\begin{aligned} A=\{x=(x_1,\dots ,x_d)\in {\mathbb {R}}^d:a_j\le x_j\le b_j\text { for all }j=1,\dots ,d\} \end{aligned}$$
for some \(-\infty \le a_j\le b_j\le \infty \), \(j=1,\dots ,d\).
Corollary 2.1
Under the assumptions of Theorem 2.1, we have
where \(C'>0\) depends only on \(\alpha ,\overline{q}_d\).
For applications, it is often useful to restate Theorem 2.1 in an asymptotic form as follows.
Corollary 2.2
Let \(\varvec{X}=(X_i)_{i=1}^\infty \) be a sequence of independent centered random variables with unit variance. Set \(w=\frac{1}{2}\) if \(\mathrm {E}[X_i^3]=0\) for every \(i\in {\mathbb {N}}\) and \(w=1\) otherwise. For every \(n\in {\mathbb {N}}\), let \(N_n,d_n\in {\mathbb {N}}\setminus \{1\}\) and \(f_{n,k}:[N_n]^{q_k}\rightarrow {\mathbb {R}}\) (\(k=1,\dots ,d_n\)) be symmetric functions vanishing on diagonals, and set \(\varvec{Q}^{(n)}(\varvec{X}):=(Q(f_{n,1};\varvec{X}),\dots ,Q(f_{n,d_n};\varvec{X}))\). Moreover, for every \(n\in {\mathbb {N}}\), let \(Z^{(n)}=(Z_{n,1},\dots ,Z_{n,d_n})\) be a \(d_n\)-dimensional centered Gaussian vector with covariance matrix \(\mathfrak {C}_n=(\mathfrak {C}_{n,kl})_{1\le k,l\le d_n}\). Suppose that \(\overline{q}_\infty :=\sup _{j\in {\mathbb {N}}}q_j<\infty \), \(\inf _{n\in {\mathbb {N}}}\min _{1\le k\le d_n}\Vert Z_{n,k}\Vert _2>0\), \(\sup _{i\in {\mathbb {N}}}\Vert X_i\Vert _{\psi _\alpha }<\infty \) for some \(\alpha \in (0,w^{-1}]\) and
as \(n\rightarrow \infty \). Moreover, setting \(a_1:=(4w\overline{q}_\infty -2)\vee (4\alpha ^{-1}(\overline{q}_\infty -1)+5)\) and \(a_2:=2\alpha ^{-1}(2\overline{q}_\infty -1)+3\), we suppose that either one of the following conditions is satisfied:
(i) \((\log d_n)^{2a_1}\max _{1\le j\le d_n}|\kappa _4(Q(f_{n,j};\varvec{X}))|\rightarrow 0\) and \((\log d_n)^{2a_1\vee a_2}\max _{1\le j\le d_n}{\mathcal {M}}(f_{n,j})\rightarrow 0\) as \(n\rightarrow \infty \).
(ii) \((\log d_n)^{a_1}\max _{1\le j\le d_n}|\kappa _4(Q(f_{n,j};\varvec{X}))|\rightarrow 0\) and \((\log d_n)^{a_1\vee a_2}\max _{1\le j\le d_n}{\mathcal {M}}(f_{n,j})\rightarrow 0\) as \(n\rightarrow \infty \), and \(q_1=q_2=\cdots \).
Then, we have \(\sup _{A\in {\mathcal {A}}^\mathrm {re}(d_n)}|P(\varvec{Q}^{(n)}(\varvec{X})\in A)-P(Z^{(n)}\in A)|\rightarrow 0\) as \(n\rightarrow \infty \).
Our second main result gives high-dimensional versions of fourth-moment theorems, universality results and Peccati–Tudor-type theorems for homogeneous sums:
Theorem 2.2
Let us keep the same notation as in Corollary 2.2. Suppose that one of the following conditions is satisfied:
(A) \(\varvec{X}\) is a sequence of independent copies of a random variable X such that \(\Vert X\Vert _{\psi _\alpha }<\infty \) for some \(\alpha >0\), \(\mathrm {E}[X^3]=0\) and \(\mathrm {E}[X^4]\ge 3\).
(B) For every i, \(X_i\) is a standardized Poisson random variable with intensity \(\lambda _i>0\), i.e., \(\lambda _i+\sqrt{\lambda _i}X_i\) is a Poisson random variable with intensity \(\lambda _i\). Moreover, \(\inf _{i\in {\mathbb {N}}}\lambda _i>0\).
(C) For every i, \(X_i\) is a standardized gamma random variable with shape \(\nu _i>0\) and unit rate, i.e., \(\nu _i+\sqrt{\nu _i}X_i\sim \gamma (\nu _i)\). Moreover, \(\inf _{i\in {\mathbb {N}}}\nu _i>0\).
Suppose also \(2\le \inf _{j\in {\mathbb {N}}}q_j\le \sup _{j\in {\mathbb {N}}}q_j<\infty \), \(0<\inf _{n\in {\mathbb {N}}}\min _{1\le j\le d_n}\mathfrak {C}^{(n)}_{jj}\le \sup _{n\in {\mathbb {N}}}\max _{1\le j\le d_n}\mathfrak {C}^{(n)}_{jj}<\infty \) and
as \(n\rightarrow \infty \) for every \(a>0\). Then, we have \(\kappa _4(Q(f;\varvec{X}))\ge 0\) for any \(N,q\in {\mathbb {N}}\) and any symmetric function \(f:[N]^q\rightarrow {\mathbb {R}}\) vanishing on diagonals. Moreover, the following conditions are equivalent:
(i) \((\log d_n)^a\max _{1\le j\le d_n}\kappa _4(Q(f_{n,j};\varvec{X}))\rightarrow 0\) as \(n\rightarrow \infty \) for every \(a>0\).
(ii) \((\log d_n)^a\max _{1\le j\le d_n}\sup _{x\in {\mathbb {R}}}|P(Q(f_{n,j};\varvec{X})\le x)-P(Z_{n,j}\le x)|\rightarrow 0\) as \(n\rightarrow \infty \) for every \(a>0\).
(iii) \((\log d_n)^a\sup _{x\in {\mathbb {R}}^{d_n}}|P(\varvec{Q}^{(n)}(\varvec{X})\le x)-P(Z^{(n)}\le x)|\rightarrow 0\) as \(n\rightarrow \infty \) for every \(a>0\).
(iv) \((\log d_n)^a\sup _{x\in {\mathbb {R}}^{d_n}}|P(\varvec{Q}^{(n)}(\varvec{Y})\le x)-P(Z^{(n)}\le x)|\rightarrow 0\) as \(n\rightarrow \infty \) for any \(a>0\) and any sequence \(\varvec{Y}=(Y_i)_{i\in {\mathbb {N}}}\) of centered independent variables with unit variance such that \(\sup _{i\in {\mathbb {N}}}\Vert Y_i\Vert _{\psi _\alpha }<\infty \) for some \(\alpha >0\).
Remark 2.2
(a) The implications (i) \(\Rightarrow \) (iii), (iii) \(\Rightarrow \) (iv) and (ii) \(\Rightarrow \) (iii) can be viewed as high-dimensional versions of fourth-moment theorems, universality results and Peccati–Tudor-type theorems for homogeneous sums, respectively. Here, Peccati–Tudor-type theorems refer to statements in which a joint CLT is implied by component-wise CLTs (Peccati & Tudor [51] established such a result for multiple Wiener–Itô integrals with respect to an isonormal Gaussian process).
(b) The proof of Theorem 2.2 relies on the fact that condition (i) automatically yields \((\log d_n)^a\max _{j}\mathcal {M}(f_{n,j})\rightarrow 0\) as \(n\rightarrow \infty \) for every \(a>0\). On the one hand, this fact has already been established in previous work for cases (A) and (B) (see the proof of Lemma 7.2). On the other hand, for case (C), this fact seems not to have appeared in the literature so far. Indeed, for case (C) we obtain it as a byproduct of the proof of Proposition 5.2 (see Lemma 5.4). As a consequence, Theorem 2.2 seems new for case (C) even in the fixed-dimensional case. We remark that the fourth-moment theorem for case (C) has been established by [1] in the univariate case, which inspired our discussions in Sect. 5 (see also [9]).
2.1 Comparison of Theorem 2.1 to Some Existing Results
2.1.1 Comparison to Corollary 7.3 in Nourdin, Peccati and Reinert [47]
First, we compare our result to the quantitative multi-dimensional CLT for homogeneous sums obtained in Nourdin et al. [47]. To state their result, we need to introduce the notion of contraction, which will also play an important role in Sect. 5.2. For two symmetric functions \(f:[N]^p\rightarrow {\mathbb {R}},g:[N]^q\rightarrow {\mathbb {R}}\) and \(r\in \{0,1,\dots ,p\wedge q\}\), we define the contraction \(f\star _rg:[N]^{p+q-2r}\rightarrow {\mathbb {R}}\) by
$$\begin{aligned} f\star _rg(i_1,\dots ,i_{p+q-2r}):=\sum _{a_1,\dots ,a_r=1}^{N}f(i_1,\dots ,i_{p-r},a_1,\dots ,a_r)g(i_{p-r+1},\dots ,i_{p+q-2r},a_1,\dots ,a_r). \end{aligned}$$
In particular, we have
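In coordinates, \(f\star _rg\) sums out the last r arguments of f and g over [N]. A direct implementation for kernels stored as numpy arrays (our own illustrative code, not from [47]) is:

```python
import numpy as np

def contraction(f, g, r):
    """Contraction f *_r g of f: [N]^p -> R and g: [N]^q -> R.

    (f *_r g)(i_1,...,i_{p+q-2r}) sums, over a_1,...,a_r in [N],
    f(i_1,...,i_{p-r}, a_1,...,a_r) * g(i_{p-r+1},...,i_{p+q-2r}, a_1,...,a_r),
    i.e., the last r axes of f are contracted against the last r axes of g.
    """
    p, q = f.ndim, g.ndim
    assert 0 <= r <= min(p, q)
    return np.tensordot(f, g, axes=(list(range(p - r, p)), list(range(q - r, q))))

# For p = q = 2 the kernels are N x N matrices: f *_1 g is the matrix
# product f g^T, f *_2 g is the inner product <f, g>, and f *_0 g is the
# tensor product.
rng = np.random.default_rng(2)
N = 6
f = rng.standard_normal((N, N)); f = (f + f.T) / 2
g = rng.standard_normal((N, N)); g = (g + g.T) / 2
```

The matrix-kernel case \(p=q=2\) is the one relevant to the quadratic forms of Sect. 2.1.2, where [f] denotes the matrix of a kernel.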
Now we are ready to state the result of [47]. To simplify the notation, we focus only on the identity covariance matrix case and do not keep the explicit dependence of constants on \(q_j\)’s.
Proposition 2.1
(Nourdin et al. [47], Corollary 7.3) Let us keep the same notation as in Theorem 2.1. Suppose that \(\mathfrak {C}_{jk}=\mathrm {E}[Q(f_j;\varvec{X})Q(f_k;\varvec{X})]\) for all \(j,k\in [d]\) and \(\mathfrak {C}\) is the identity matrix of size d. Suppose also that \(\beta :=\max _{1\le i\le N}\mathrm {E}[|X_i|^3]<\infty \) and \(q_d\ge \cdots \ge q_1\ge 2\). Then, we have
where \({\mathcal {C}}({\mathbb {R}}^{d})\) is the set of all convex Borel subsets of \({\mathbb {R}}^d\), \(K>0\) is a constant depending only on \(\overline{q}_d\), \({\mathsf {C}}:=\sum _{i=1}^N\max _{1\le j\le d}{{\,\mathrm{Inf}\,}}_i(f_j)\) and
To compare Proposition 2.1 to our result, we need to bound the quantity \(\overline{\Delta }\) by \(|\kappa _4(Q(f_j;\varvec{X}))|\) and \(\mathcal {M}(f_j)\), \(j\in [d]\). This can be carried out by the following lemma (proved in Sect. 7.4):
Lemma 2.1
Let \(\varvec{X}=(X_i)_{i=1}^N\) be a sequence of independent centered random variables with unit variance and such that \(M:=1+\max _{1\le i\le N}\mathrm {E}[X_i^4]<\infty \). Also, let \(q\ge 2\) be an integer and \(f:[N]^q\rightarrow {\mathbb {R}}\) be a symmetric function vanishing on diagonals. Then, we have
where \(C>0\) depends only on q.
Remark 2.3
The bound in Lemma 2.1 is generally sharp. In fact, it is well known that \(\sqrt{|\kappa _4(Q(f;\varvec{X}))|}\) has the same order as \(\max _{1\le r\le q-1}\Vert f\star _r f\Vert _{\ell _2}\) if \(\varvec{X}\) is Gaussian (see, e.g., Eq.(5.2.6) in [44]). Moreover, if \(q=2\) and \(f(i,j)=N^{-1/2}1_{\{|i-j|=1\}}\), then both \(\Vert f\star _1f\Vert _{\ell _2}\) and \(\Vert f\Vert _{\ell _2}\sqrt{\mathcal {M}(f)}\) are of order \(N^{-1/2}\).
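The \(N^{-1/2}\) orders claimed in this remark are easy to check numerically. In the following sketch (our own code), \(f\star _1f\) is computed as a matrix product since f is a symmetric matrix kernel; both \(\sqrt{N}\,\Vert f\star _1f\Vert _{\ell _2}\) and \(\sqrt{N}\,\Vert f\Vert _{\ell _2}\sqrt{\mathcal {M}(f)}\) stabilize as N grows (at \(\sqrt{6}\) and 2, respectively, by a direct count of the nonzero entries).

```python
import numpy as np

def banded_kernel(N):
    """f(i, j) = N^{-1/2} 1_{|i-j|=1}: symmetric, vanishing on the diagonal."""
    f = np.zeros((N, N))
    idx = np.arange(N - 1)
    f[idx, idx + 1] = N ** (-0.5)
    f[idx + 1, idx] = N ** (-0.5)
    return f

rates = {}
for N in (100, 400, 1600):
    f = banded_kernel(N)
    star = f @ f                                  # f *_1 f, since f is symmetric
    lhs = np.linalg.norm(star)                    # ||f *_1 f||_{l2}
    max_inf = np.sum(f**2, axis=1).max()          # M(f) = maximal influence
    rhs = np.linalg.norm(f) * np.sqrt(max_inf)    # ||f||_{l2} sqrt(M(f))
    # both quantities are O(N^{-1/2}); rescaling by sqrt(N) stabilizes them
    rates[N] = (np.sqrt(N) * lhs, np.sqrt(N) * rhs)
```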
With the help of Lemma 2.1, we observe that the bound in (2.4) typically has the same order as
where
Thus, in the bound of (2.4), the dimension appears as a power of d, while the exponent of the “standard” convergence rate \(\delta :=\max _{1\le j\le d}\sqrt{|\kappa _4(Q(f_j;\varvec{X}))|+\mathcal {M}(f_j)}\) is 1/4. Both are much improved in our result: the former appears as a power of \(\log d\) and the latter is 1/3. Nevertheless, we should note that the bound in (2.4) is given for a much stronger metric than the Kolmogorov distance. In fact, to the best of the author’s knowledge, all the known bounds for this metric depend polynomially on the dimension even for sums of independent random variables; see [58, Section 1.1] and references therein.
Remark 2.4
(a) Roughly speaking, the exponent of \(\delta \) is 1/4 in the bound of (2.4) because this bound is transferred from an analogous quantitative CLT for the Gaussian counterpart by the Lindeberg method with moments matched up to the second order. To overcome this issue, we need to match moments up to the third order, and thus we can no longer rely on the result analogous to Theorem 2.1 for the Gaussian counterpart, which is obtained in [35]. For this reason, we will develop a high-dimensional CLT for homogeneous sums based on normal and gamma variables in Sect. 5.
(b) It is worth noting that the quantity \({\mathsf {C}}=\sum _{i=1}^N\max _{1\le j\le d}{{\,\mathrm{Inf}\,}}_i(f_j)\) in the bound of (2.4) can be much larger than \(\max _{1\le j\le d}\sum _{i=1}^N{{\,\mathrm{Inf}\,}}_i(f_j)=\max _{1\le j\le d}\Vert f_j\Vert _{\ell _2}^2\) in high-dimensional situations (see Remark 2.5 for a concrete example). Indeed, naïve application of the Lindeberg method produces a quantity like \({\mathsf {C}}\), which prevents us from using the Lindeberg method in its pure form (this is why Chernozhukov et al. [13, 18] rely on Stein’s method to prove their high-dimensional CLTs; see [14, Appendix L] for a detailed discussion). In Sect. 6, we will resolve this issue by randomizing the Lindeberg method, as Deng & Zhang [24] have recently done in the context of sums of independent random vectors.
2.1.2 Comparison to Theorem 3.2 in Koike [35]
Next, we compare our result to the Gaussian approximation result for maxima of quadratic forms obtained in [35, Theorem 3.2]. Here, for an explicit comparison, we state this result after applying [35, Corollary 3.1]. For a function \(f:[N]^2\rightarrow {\mathbb {R}}\), we denote the \(N\times N\) matrix \((f(i,j))_{1\le i,j\le N}\) by [f].
Proposition 2.2
(Koike [35], Theorem 3.2 and Corollary 3.1) Let us keep the same notation as in Corollary 2.2. Suppose that \(\inf _{n\in {\mathbb {N}}}\min _{1\le k\le d_n}\Vert Z_{n,k}\Vert _2>0\), \(q_j=2\) for all j, \(\sup _{i\in {\mathbb {N}}}\Vert X_i\Vert _{\psi _2}<\infty \) and (2.2) holds true as \(n\rightarrow \infty \). Suppose also that
as \(n\rightarrow \infty \). Then, we have
as \(n\rightarrow \infty \).
When we apply our result to quadratic forms as above, we obtain the following result.
Proposition 2.3
Let us keep the same notation as in Corollary 2.2. Set
for every n. Assume \(q_1=q_2=\cdots =2\), \(\inf _{n\in {\mathbb {N}}}\min _{1\le k\le d_n}\Vert Z_{n,k}\Vert _2>0\) and (2.2). Assume also that either one of the following conditions is satisfied:
(i) \(\sup _{i\in {\mathbb {N}}}\Vert X_i\Vert _{\psi _1}<\infty \) and \((\log d_n)^5\Delta _n\rightarrow 0\) as \(n\rightarrow \infty \).
(ii) \(\sup _{i\in {\mathbb {N}}}\Vert X_i\Vert _{\psi _2}<\infty \), \(\mathrm {E}[X_i^3]=0\) for all i and \((\log d_n)^3\Delta _n\rightarrow 0\) as \(n\rightarrow \infty \).
Then, we have \(\sup _{A\in {\mathcal {A}}^\mathrm {re}(d_n)}|P(\varvec{Q}^{(n)}(\varvec{X})\in A)-P(Z^{(n)}\in A)|\rightarrow 0\) as \(n\rightarrow \infty \).
Remark 2.5
(a) Regarding the convergence rate of \(\max _{1\le k\le d_n}{{\,\mathrm{tr}\,}}\left( [f_{n,k}]^4\right) \), condition (i) in Proposition 2.3 is stronger than the one in Proposition 2.2. However, the former imposes a weaker moment condition on \(\varvec{X}\) than the latter. More importantly, the second term of \(\Delta _n\) is always smaller than or equal to the second term in (2.5), and the latter can be much larger than the former. For example, let us assume \(N_n=d_n=n\) and consider the functions \(f_{n,k}\) defined as follows:
Then, we have \({{\,\mathrm{Inf}\,}}_i(f_{n,k})=(1+1_{\{1<i<n\}})n^{-1/2}\) if \(i\in \{k,k\pm 1\}\) and \({{\,\mathrm{Inf}\,}}_i(f_{n,k})=(1+1_{\{1<i<n\}})n^{-1}\) otherwise. Therefore, on the one hand
does not converge to 0 as \(n\rightarrow \infty \), but on the other hand \(\max _{1\le k\le d_n}\sqrt{\mathcal {M}(f_{n,k})}\Vert f_{n,k}\Vert _{\ell _2}=O(n^{-1/4})\) as \(n\rightarrow \infty \). Note that in this case we have \(\max _{1\le k\le d_n}\sqrt{{{\,\mathrm{tr}\,}}\left( [f_{n,k}]^4\right) }=O(n^{-1/4})\) and \(\Vert f_{n,k}\Vert _{\ell _2}\rightarrow 1\) as \(n\rightarrow \infty \), so (2.6) holds true due to Proposition 2.3.
(b) Condition (ii) in Proposition 2.3 requires the additional zero-skewness assumption, but it always imposes a weaker assumption on the functions \(f_{n,k}\) than the one in Proposition 2.2.
(c) We have \(\Delta _n\le 2\max _{k}\Vert [f_{n,k}]\Vert _{\mathrm {sp}}\Vert f_{n,k}\Vert _{\ell _2}\) with \(\Vert \cdot \Vert _{\mathrm {sp}}\) the spectral norm of matrices. So \((\log d_n)^a\Delta _n\rightarrow 0\) for some \(a>0\) is implied by \((\log d_n)^a\max _{k}\Vert [f_{n,k}]\Vert _{\mathrm {sp}}\Vert f_{n,k}\Vert _{\ell _2}\rightarrow 0\).
2.2 Statistical Application: Bootstrap Test for the Absence of Lead–Lag Relationship
Let \(W_t=(W_t^1,W_t^2)\) \((t\in {\mathbb {R}})\) be a two-sided bivariate standard Wiener process. Also let \(\rho \in (-1,1)\) and \(\vartheta \in {\mathbb {R}}\) be two (unknown) parameters. We define the bivariate process \(B_t=(B_t^1,B_t^2)\) \((t\in {\mathbb {R}})\) as \(B_t^1=W^1_t\) and \(B_t^2=\rho W^1_{t-\vartheta }+\sqrt{1-\rho ^2}W^2_t\). For each \(\nu =1,2\), we consider the process \(X^\nu =(X^\nu _t)_{t\ge 0}\) given by
where \(\sigma _\nu \in L^2(0,\infty )\) is nonnegative-valued and deterministic. If \(\rho \ne 0\), there is a correlation between \(X^1\) and \(X^2\) with a time lag of \(\vartheta \). We aim to test whether such a correlation really exists, given (possibly asynchronous) high-frequency observations of \(X^1\) and \(X^2\). Specifically, for each \(\nu =1,2\), we observe the process \(X^\nu \) on the interval [0, T] at the deterministic sampling times \(0\le t^\nu _0<t^\nu _1<\cdots <t^\nu _{n_\nu }\le T\), which implicitly depend on the parameter \(n\in {\mathbb {N}}\) such that
as \(n\rightarrow \infty \), where we set \(t^\nu _{-1}:=0\) and \(t^\nu _{n_\nu +1}:=T\) for each \(\nu =1,2\). To test the null hypothesis \(H_0:\rho =0\) against the alternative \(H_1:\rho \ne 0\), Koike [35] proposed the test statistic \(T_n=\sqrt{n}\max _{\theta \in {\mathcal {G}}_n}|U_n(\theta )|\), where \({\mathcal {G}}_n\) is a finite subset of \({\mathbb {R}}\) and
The null distribution of \(T_n\) can be approximated by its Gaussian analog as follows:
Proposition 2.4
([35], Proposition 4.1) For each \(n\in {\mathbb {N}}\), let \((Z_n(\theta ))_{\theta \in {\mathcal {G}}_n}\) be a family of centered Gaussian variables such that \(\mathrm {E}[Z_n(\theta )Z_n(\theta ')]=n{{\,\mathrm{Cov}\,}}[U_n(\theta ),U_n(\theta ')]\) for all \(\theta ,\theta '\in {\mathcal {G}}_n\). Suppose that \(\sup _{t\in [0,T]}(\sigma _1(t)+\sigma _2(t))<\infty \) and there are positive constants \({\underline{v}},{\overline{v}}\) such that
for all \(n\in {\mathbb {N}}\) and \(\theta \in {\mathcal {G}}_n\). Then, under the null hypothesis \(\rho =0\), we have
as \(n\rightarrow \infty \), provided that \(nr_n^2\log ^6(\#{\mathcal {G}}_n)\rightarrow 0\).
Since the distribution of \(\max _{\theta \in {\mathcal {G}}_n}|Z_n(\theta )|\) is analytically intractable, Koike [35] proposed a wild bootstrap procedure to approximate it. Formally, let \((w^1_i)_{i=1}^\infty \) and \((w^2_j)_{j=1}^\infty \) be mutually independent sequences of i.i.d. random variables independent of \(X^1\) and \(X^2\). Assume that \(\mathrm {E}[w^1_1]=\mathrm {E}[w^2_1]=0\), \({{\,\mathrm{Var}\,}}[w^1_1]={{\,\mathrm{Var}\,}}[w^2_1]=1\) and \(\Vert w^1_1\Vert _{\psi _2}\vee \Vert w^2_1\Vert _{\psi _2}<\infty \). Define the bootstrapped test statistic as \(T_n^*=\sqrt{n}\max _{\theta \in {\mathcal {G}}_n}|U_n^*(\theta )|\) where
In [35, Proposition B.8], it is shown that
as \(n\rightarrow \infty \), provided that \(r_n=O(n^{-3/4-\eta })\) and \(\#{\mathcal {G}}_n=O(n^\gamma )\) for some \(\eta ,\gamma >0\) in addition to the assumptions of Proposition 2.4. Our result allows us to relax the condition on \(r_n\) as follows:
Proposition 2.5
Under the assumptions of Proposition 2.4, we have (2.8) as \(n\rightarrow \infty \), provided that \(r_n=O(n^{-1/2-\eta })\) and \(\#{\mathcal {G}}_n=O(n^\gamma )\) for some \(\eta ,\gamma >0\).
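Schematically, the wild bootstrap multiplies the increments entering the statistic by external i.i.d. weights with mean zero and unit variance and recomputes the statistic many times; empirical quantiles of the recomputed values then serve as critical values. The following toy sketch illustrates only this scheme; the increments, the statistic and the sample sizes are illustrative placeholders, not the actual quantities \(U_n^*(\theta )\) of [35].

```python
import math
import random

random.seed(3)
n, B = 400, 500
# Hypothetical "increments" entering the statistic (placeholder data).
incr = [random.gauss(0.0, 1.0) / math.sqrt(n) for _ in range(n)]

def stat(xs):
    # Placeholder statistic; the real one is a maximum over theta.
    return abs(sum(xs))

# Wild bootstrap: multiply each increment by an external N(0,1) weight
# and recompute the statistic B times.
boot = sorted(
    stat([random.gauss(0.0, 1.0) * xi for xi in incr]) for _ in range(B)
)
crit = boot[int(0.95 * B) - 1]  # bootstrap 95% critical value
```

Conditionally on the data, each bootstrap replication here is centered Gaussian with variance \(\sum _i\mathrm {incr}_i^2\approx 1\), so `crit` lands near the 95% quantile of \(|{\mathcal {N}}(0,1)|\).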
3 Chernozhukov–Chetverikov–Kato Theory
In this section, we present the basic scheme of the CCK theory for establishing high-dimensional CLTs. One main ingredient of the CCK theory is the following smooth approximation of the maximum function: for each \(\beta >0\), we define the function \(\Phi _\beta :{\mathbb {R}}^d\rightarrow {\mathbb {R}}\) by
Eq.(1) in [16] states that
for any \(x\in {\mathbb {R}}^d\). Therefore, the larger \(\beta \) is, the better \(\Phi _\beta \) approximates the maximum function. The next lemma, which is a summary of [24, Lemmas 5–6], highlights the key properties of this smooth max function:
Lemma 3.1
For any \(\beta >0\), \(m\in {\mathbb {N}}\) and \(C^m\) function \(h:{\mathbb {R}}\rightarrow {\mathbb {R}}\), there is an \(({\mathbb {R}}^d)^{\otimes m}\)-valued function \(\Upsilon _{\beta }(x)=(\Upsilon ^{j_1,\dots ,j_m}_\beta (x))_{1\le j_1,\dots ,j_m\le d}\) on \({\mathbb {R}}^d\) satisfying the following conditions:
-
(i)
For any \(x\in {\mathbb {R}}^d\) and \(j_1,\dots ,j_m\in [d]\), we have \( |\partial _{j_1\dots j_m}(h\circ \Phi _\beta )(x)|\le \Upsilon _\beta ^{j_1,\dots ,j_m}(x). \)
-
(ii)
For every \(x\in {\mathbb {R}}^d\), we have
$$\begin{aligned} \sum _{j_1,\dots ,j_m=1}^d\Upsilon _\beta ^{j_1,\dots , j_m}(x) \le c_{m}\max _{1\le k\le m}\beta ^{m-k}\Vert h^{(k)}\Vert _\infty , \end{aligned}$$where \(c_{m}>0\) depends only on m.
-
(iii)
For any \(x,t\in {\mathbb {R}}^d\) and \(j_1,\dots ,j_m\in [d]\), we have
$$\begin{aligned} e^{-8\Vert t\Vert _{\ell _\infty }\beta }\Upsilon _\beta ^{j_1,\dots ,j_m}(x+t)\le \Upsilon _\beta ^{j_1,\dots ,j_m}(x)\le e^{8\Vert t\Vert _{\ell _\infty }\beta }\Upsilon _\beta ^{j_1,\dots ,j_m}(x+t). \end{aligned}$$
Remark 3.1
An explicit expression of the constant \(c_m\) in Lemma 3.1 can be derived from [24, Lemma 5]. In particular, we have \(c_1=1\) and \(c_2=3\).
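For concreteness, the approximation property above can be checked numerically. The sketch below assumes the standard log-sum-exp form \(\Phi _\beta (x)=\beta ^{-1}\log \sum _{j=1}^de^{\beta x_j}\) used in [16], for which \(0\le \Phi _\beta (x)-\max _{1\le j\le d}x_j\le \beta ^{-1}\log d\).

```python
import math
import random

def smooth_max(x, beta):
    # Phi_beta(x) = beta^{-1} * log(sum_j exp(beta * x_j)), computed stably.
    m = max(x)
    return m + math.log(sum(math.exp(beta * (xj - m)) for xj in x)) / beta

random.seed(0)
d = 1000
x = [random.gauss(0.0, 1.0) for _ in range(d)]
for beta in (1.0, 10.0, 100.0):
    gap = smooth_max(x, beta) - max(x)
    assert 0.0 <= gap <= math.log(d) / beta  # larger beta => tighter approximation
```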
Another important ingredient of the CCK theory is the so-called anti-concentration inequality. For our purpose, the following one is particularly useful (see [19] for the proof):
Lemma 3.2
(Nazarov’s inequality) If \(\underline{\sigma }:=\min _{1\le j\le d}\Vert Z_j\Vert _2>0\), for any \(x\in {\mathbb {R}}^d\) and \(\varepsilon >0\) we have
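In the special case where Z has i.i.d. standard normal components, the probability increment in Nazarov's inequality is available in closed form, so the bound (taken here with the constant \(\varepsilon (\sqrt{2\log d}+2)/\underline{\sigma }\), the form commonly used in the CCK literature) can be checked numerically; this is an illustration, not a proof.

```python
import math

# Standard normal CDF.
Phi = lambda t: 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

d, eps = 10_000, 0.01
# For Z with i.i.d. N(0,1) components (so sigma_min = 1):
#   P(Z <= (x + eps)*1) - P(Z <= x*1) = Phi(x + eps)^d - Phi(x)^d.
bound = eps * (math.sqrt(2.0 * math.log(d)) + 2.0)
worst = max(Phi(x / 100 + eps) ** d - Phi(x / 100) ** d for x in range(-600, 601))
assert 0.0 <= worst <= bound
```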
These tools enable us to establish the following form of smoothing inequality:
Proposition 3.1
Let \(g_{0}:{\mathbb {R}}\rightarrow [0,1]\) be a measurable function such that \(g_{0}(t)=1\) for \(t\le 0\) and \(g_{0}(t)=0\) for \(t\ge 1\). Also, let \(\varepsilon >0\) and set \(\beta :=\varepsilon ^{-1}\log d\). Suppose that \(\underline{\sigma }:=\min _{1\le j\le d}\Vert Z_j\Vert _2>0\). Then, for any d-dimensional random vector F, we have
where
Proof
This result has been essentially shown in Step 2 in the proof of [18, Lemma 5.1]. \(\square \)
Remark 3.2
Proposition 3.1 can be seen as a special version of more general smoothing inequalities such as [7, Lemma 2.1]. An important feature of bound (3.2) is that the quantity \(\Delta _\varepsilon (F,Z)\) contains only test functions of the form \(x\mapsto g_0(\Phi _\beta (x-y))\) for some \(y\in {\mathbb {R}}^d\). If \(g_0\) is sufficiently smooth, derivatives of such a test function admit good estimates with respect to the dimension d, as seen from Lemma 3.1.
It might be worth mentioning that we can use Proposition 3.1 to derive a bound for the Kolmogorov distance in terms of the Wasserstein distance. Let us recall the definition of the Wasserstein distance.
Definition 3.1
(Wasserstein distance) For d-dimensional random vectors F, G with integrable components, the Wasserstein distance between the laws of F and G is defined by
where \(\mathcal {H}\) denotes the set of all functions \(h:{\mathbb {R}}^d\rightarrow {\mathbb {R}}\) such that
Here, \(\Vert \cdot \Vert \) is the usual Euclidean norm on \({\mathbb {R}}^d\).
Corollary 3.1
Under the assumptions of Proposition 3.1, we have
Proof
It suffices to consider the case \({\mathcal {W}}_1(F,Z)>0\). Let us define the function \(g_0:{\mathbb {R}}\rightarrow [0,1]\) by \(g_0(x)=\min \{1,\max \{1-x,0\}\}\), \(x\in {\mathbb {R}}\). Then, for any \(x,x',y\in {\mathbb {R}}^d\) and \(\varepsilon >0\), we have \(|g_0(\varepsilon ^{-1}\Phi _\beta (x-y))-g_0(\varepsilon ^{-1}\Phi _\beta (x'-y))|\le \varepsilon ^{-1}\Vert x-x'\Vert _{\ell _\infty }\) by [13, Lemma A.3], so we obtain \(\Delta _\varepsilon (F,Z)\le \varepsilon ^{-1}{\mathcal {W}}_1(F,Z)\). Now, setting \(\varepsilon =\sqrt{\underline{\sigma }{\mathcal {W}}_1(F,Z)/(2\sqrt{2\log d}+4)}\), we infer the desired result from Proposition 3.1. \(\square \)
When \(d=1\), Corollary 3.1 recovers the standard estimate (cf. Eq.(C.2.6) in [44]). We remark that a bound similar to the above (with a slightly different constant) has already appeared in [4, Theorem 3.1].
Remark 3.3
It is generally impossible to derive (1.2)-type bounds from the corresponding ones for the Wasserstein distance. To see this, let \(F=(F_1,\dots ,F_d)\) be a d-dimensional random vector such that the laws of \(F_1,\dots ,F_d\) are identical (and integrable). Also, let \(G=(G_1,\dots ,G_d)\) be another d-dimensional random vector satisfying the same condition. Then, we can easily verify \({\mathcal {W}}_1(F,G)\ge \sqrt{d}{\mathcal {W}}_1(F_1,G_1)\) by definition: for any 1-Lipschitz function \(h_0:{\mathbb {R}}\rightarrow {\mathbb {R}}\), the function \(x\mapsto d^{-1/2}\sum _{j=1}^dh_0(x_j)\) is 1-Lipschitz with respect to the Euclidean norm, and testing against functions of this form yields the claimed lower bound.
4 Stein Kernels and High-Dimensional CLTs
In the rest of the paper, we fix a \(C^\infty \) function \(g_0:{\mathbb {R}}\rightarrow [0,1]\) such that \(g_0(t)=1\) for \(t\le 0\) and \(g_0(t)=0\) for \(t\ge 1\). For example, we can take \(g_0(t)=f_0(1-t)/\{f_0(t)+f_0(1-t)\}\), where the function \(f_0:{\mathbb {R}}\rightarrow {\mathbb {R}}\) is defined by \(f_0(t)=e^{-1/t}\) if \(t>0\) and \(f_0(t)=0\) otherwise.
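The defining properties of this bump-type function are easy to check directly:

```python
import math

def f0(t):
    return math.exp(-1.0 / t) if t > 0 else 0.0

def g0(t):
    # g0(t) = f0(1 - t) / (f0(t) + f0(1 - t)): equals 1 for t <= 0, 0 for t >= 1.
    return f0(1.0 - t) / (f0(t) + f0(1.0 - t))

assert g0(-0.5) == 1.0 and g0(0.0) == 1.0   # f0 vanishes on (-inf, 0]
assert g0(1.0) == 0.0 and g0(2.0) == 0.0
assert abs(g0(0.5) - 0.5) < 1e-12           # symmetric midpoint
```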
To make Proposition 3.1 useful, we need to obtain a “good” upper bound for the quantity \(\Delta _\varepsilon (F,Z)\). As briefly mentioned in Remark 2.4, Chernozhukov et al. [13] have pointed out that Stein’s method effectively solves this task. Moreover, discussions in [16, 35] implicitly suggest that the CCK theory would have a nice connection to Stein kernels. In this section, we illustrate this idea.
Definition 4.1
(Stein kernel) Let \(F=(F_1,\dots ,F_d)\) be a centered d-dimensional random vector. A \(d\times d\) matrix-valued measurable function \(\tau _F=(\tau _F^{ij})_{1\le i,j\le d}\) on \({\mathbb {R}}^d\) is called a Stein kernel for (the law of) F if \(\max _{1\le i,j\le d}\mathrm {E}[|\tau _F^{ij}(F)|]<\infty \) and
for any \(\varphi \in C^\infty _b({\mathbb {R}}^d)\).
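By Stein's lemma, the constant identity matrix is a Stein kernel for the standard Gaussian law, in which case (4.1) reads \(\mathrm {E}[F_i\varphi (F)]=\mathrm {E}[\partial _i\varphi (F)]\). A Monte Carlo check of this special case (the test function below is an arbitrary smooth bounded choice):

```python
import math
import random

random.seed(1)
n = 200_000
lhs = rhs = 0.0
for _ in range(n):
    f1, f2 = random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)  # F ~ N(0, I_2)
    t = math.tanh(f1 + 0.5 * f2)   # phi(F) for phi(x) = tanh(x1 + 0.5*x2)
    lhs += f1 * t                  # F_1 * phi(F)
    rhs += 1.0 - t * t             # d_1 phi(F)
lhs /= n
rhs /= n
assert abs(lhs - rhs) < 0.02  # agreement up to Monte Carlo error
```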
Remark 4.1
In this paper, we adopt \(C^\infty _b({\mathbb {R}}^d)\) as the class of test functions for which identity (4.1) holds true because this is convenient for our purposes, but other classes are also used in the literature; see [20] for instance.
Lemma 4.1
Let \(F=(F_1,\dots ,F_d)\) be a centered d-dimensional random vector. Also, let \(\tau _F=(\tau _F^{ij})_{1\le i,j\le d}\) be a Stein kernel for F. Then, we have
for any \(\beta >0\) and \(h\in C^\infty _b({\mathbb {R}})\), where
Proof
The proof is essentially the same as that of [16, Theorem 1] or [35, Proposition 2.1], so we omit it. \(\square \)
Proposition 4.1
Suppose that \(d\ge 2\) and \(\underline{\sigma }:=\min _{1\le j\le d}\Vert Z_j\Vert _2>0\). Under the assumptions of Lemma 4.1, there is a universal constant \(C>0\) such that
Proof
Thanks to [44, Lemma 4.1.3], it suffices to consider the case \(\Delta >0\). By Lemma 4.1, for any \(\varepsilon >0\) we have \(\Delta _\varepsilon (F,Z)\le C'\varepsilon ^{-2}(\log d)\Delta \), where \(C'>0\) is a universal constant. Therefore, Proposition 3.1 yields
Now, setting \(\varepsilon =\Delta ^{1/3}(\log d)^{1/6}\), we obtain the desired result. \(\square \)
5 A High-Dimensional CLT for Normal-Gamma Homogeneous Sums
In view of the results in Sect. 4, we naturally seek a situation where a vector of homogeneous sums has a Stein kernel. This is the case when all the components are eigenfunctions of a Markov diffusion operator (cf. Proposition 5.1 in [39]). Moreover, as clarified in [1, 9, 38], only a certain spectral property of the Markov diffusion operator is essential for deriving a fourth-moment-type bound for the variance of the corresponding Stein kernel. This spectral property is satisfied, in particular, when each \(X_i\) is either a Gaussian or a (standardized) gamma variable, so this section focuses on such a situation and derives a high-dimensional CLT for this special case.
For each \(\nu >0\), we denote by \(\gamma _\pm (\nu )\) the distribution of the random variable \(\pm (X-\nu )/\sqrt{\nu }\) with \(X\sim \gamma (\nu )\). Also, for every \(q\in {\mathbb {N}}\) we set \( \mathfrak {c}_q:=\sum _{r=1}^qr!\left( {\begin{array}{c}q\\ r\end{array}}\right) ^2. \)
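For illustration, the combinatorial constant \(\mathfrak {c}_q\) and the first cumulants of the standardized gamma laws can be computed directly (the cumulant facts used below are standard for \(\gamma (\nu )\): \(\kappa _1=\kappa _2=\nu \) and \(\kappa _3=2\nu \)).

```python
import math

def frak_c(q):
    # c_q = sum_{r=1}^q r! * binom(q, r)^2.
    return sum(math.factorial(r) * math.comb(q, r) ** 2 for r in range(1, q + 1))

assert frak_c(1) == 1
assert frak_c(2) == 1 * 2**2 + 2 * 1**2           # = 6
assert frak_c(3) == 1 * 3**2 + 2 * 3**2 + 6 * 1   # = 33

# gamma_+(nu) is the law of (X - nu)/sqrt(nu) for X ~ Gamma(nu, 1): since
# kappa_1 = kappa_2 = nu and kappa_3 = 2*nu, it is centered with unit variance
# and has skewness 2/sqrt(nu), which vanishes as nu -> infinity.
nu = 4.0
skewness = 2.0 * nu / nu ** 1.5
assert abs(skewness - 2.0 / math.sqrt(nu)) < 1e-12
```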
Proposition 5.1
Let us keep the same notation as in Theorem 2.1 and assume \(d\ge 2\). Let \(\varvec{Y}=(Y_i)_{i=1}^N\) be a sequence of independent random variables such that the law of \(Y_i\) belongs to \(\{{\mathcal {N}}(0,1)\}\cup \{\gamma _+(\nu ):\nu>0\}\cup \{\gamma _-(\nu ):\nu >0\}\) for all i. For every i, define the constants \(v_i\) and \(\eta _i\) by
We also set \(w_*=1/2\) if \(Y_i\sim {\mathcal {N}}(0,1)\) for all i and \(w_*=1\) otherwise. Then, \(\kappa _4(Q(f_j;\varvec{Y}))\ge 0\) for all j and
for any \(\beta >0\) and \(h\in C^\infty _b({\mathbb {R}})\), where \(C>0\) depends only on \(\overline{q}_d\) and
with \(\overline{v}_N:=\max _{1\le i\le N}v_i\) and \(\underline{\eta }_N:=\min _{1\le i\le N}\eta _i\).
The rest of this section is devoted to the proof of Proposition 5.1. Throughout, we assume that the probability space \((\Omega ,{\mathcal {F}},P)\) is given by the product probability space \((\prod _{i=1}^N\Omega _i,\bigotimes _{i=1}^N{\mathcal {F}}_i,\bigotimes _{i=1}^NP_i)\), where
Then, we realize the variables \(Y_1,\dots ,Y_N\) as follows: For \(\omega =(\omega _1,\dots ,\omega _N)\in \Omega \), we define
5.1 \(\Gamma \)-Calculus
Our first aim is to construct a suitable Markov diffusion operator whose eigenspaces contain all the components of \(\varvec{Q}(\varvec{Y})\). In the following, for an open subset U of \({\mathbb {R}}^m\), we write \(C^\infty _p(U)\) for the set of all real-valued \(C^\infty \) functions on U all of whose partial derivatives have at most polynomial growth.
First, we denote by \({{\,\mathrm{L}\,}}_\text {OU}\) the Ornstein–Uhlenbeck operator on \({\mathbb {R}}\). Next, for every \(\nu >0\), we write \({{\,\mathrm{L}\,}}_\nu \) for the Laguerre operator on \((0,\infty )\) with parameter \(\nu \). We then define the operators \({\mathcal {L}}_1,\dots ,{\mathcal {L}}_N\) by
Finally, we construct the densely defined symmetric operator \({{\,\mathrm{L}\,}}\) in \(L^2(P)\) by tensorization of \({\mathcal {L}}_1,\dots ,{\mathcal {L}}_N\) (see Section 2.2 of [2] for details). We will use the following properties of \({{\,\mathrm{L}\,}}\) (cf. [1] and Section 2.2 of [2]):
-
(i)
If F and G are eigenfunctions of \(-{{\,\mathrm{L}\,}}\) associated with eigenvalues p and q, respectively, FG belongs to \(\bigoplus _{k=0}^{p+q}{{\,\mathrm{Ker}\,}}({{\,\mathrm{L}\,}}+k{{\,\mathrm{Id}\,}})\).
-
(ii)
The eigenspaces of \({{\,\mathrm{L}\,}}_\text {OU}\) and \({{\,\mathrm{L}\,}}_\nu \) associated with eigenvalue \(k\in {\mathbb {Z}}_+\) are given by \({{\,\mathrm{Ker}\,}}({{\,\mathrm{L}\,}}_\text {OU}+k{{\,\mathrm{Id}\,}})=\{aH_k:a\in {\mathbb {R}}\}\) and \({{\,\mathrm{Ker}\,}}({{\,\mathrm{L}\,}}_\nu +k{{\,\mathrm{Id}\,}})=\{aL^{(\nu -1)}_k:a\in {\mathbb {R}}\}\), respectively. Here, \(H_k\) and \(L_k^{(\alpha )}\) denote the Hermite polynomial of degree k and Laguerre polynomial of degree k and parameter \(\alpha >-1\), respectively.
-
(iii)
The eigenspace of \({{\,\mathrm{L}\,}}\) associated with eigenvalue k is given by
$$\begin{aligned} {{\,\mathrm{Ker}\,}}({{\,\mathrm{L}\,}}+k{{\,\mathrm{Id}\,}})=\bigoplus _{\begin{array}{c} k_1+\cdots +k_N=k\\ k_1,\dots ,k_N\in {\mathbb {Z}}_+ \end{array}} {{\,\mathrm{Ker}\,}}({\mathcal {L}}_1+k_1{{\,\mathrm{Id}\,}})\otimes \cdots \otimes {{\,\mathrm{Ker}\,}}({\mathcal {L}}_N+k_N{{\,\mathrm{Id}\,}}).\nonumber \\ \end{aligned}$$(5.2)
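The eigenfunction statements in (ii) can be verified by hand for low degrees; the sketch below assumes the standard normalizations \({{\,\mathrm{L}\,}}_\text {OU}f=f''-xf'\) and \({{\,\mathrm{L}\,}}_\nu f=xf''+(\nu -x)f'\) of the Ornstein–Uhlenbeck and Laguerre generators.

```python
# Probabilists' Hermite polynomial H_3(x) = x^3 - 3x: L_OU H_3 = -3 * H_3.
H3 = lambda x: x**3 - 3.0 * x
dH3 = lambda x: 3.0 * x**2 - 3.0
d2H3 = lambda x: 6.0 * x
for x in (-2.0, -0.3, 0.7, 1.9):
    assert abs((d2H3(x) - x * dH3(x)) + 3.0 * H3(x)) < 1e-9

# Laguerre polynomial L_1^{(nu-1)}(x) = nu - x: L_nu applied to it gives -1 times it.
nu = 2.5
L1 = lambda x: nu - x
for x in (0.1, 1.0, 3.7):
    # x * (L1)'' + (nu - x) * (L1)' = (nu - x) * (-1) = -1 * L1(x).
    assert abs((x * 0.0 + (nu - x) * (-1.0)) + L1(x)) < 1e-9
```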
Let us write \({\mathcal {S}}=C^\infty _p(\Omega )\). We define the carré du champ operator of \({{\,\mathrm{L}\,}}\) by
for all \(F,G\in {\mathcal {S}}\). The following lemma is a special case of [39, Proposition 5.1].
Lemma 5.1
For every \((i,j)\in [d]^2\), define the function \(\tau ^{ij}:{\mathbb {R}}^d\rightarrow {\mathbb {R}}\) by
Then, \(\tau =(\tau ^{ij})_{1\le i,j\le d}\) is a Stein kernel for \(\varvec{Q}(\varvec{Y})\).
We refer to [5] for more details about these operators.
5.2 A Bound for the Variance of the Carré du Champ Operator
In view of Lemmas 4.1 and 5.1, we obtain (5.1) once we show that
where \(C>0\) depends only on \(\overline{q}_d\). As a first step, we estimate \({{\,\mathrm{Var}\,}}[\Gamma ( Q(f_j;\varvec{Y}),Q(f_k;\varvec{Y}))]\) for every \((j,k)\in [d]^2\). More precisely, our aim here is to prove the following result:
Proposition 5.2
Let \(p\le q\) be two positive integers. Let \(f:[N]^p\rightarrow {\mathbb {R}}\) and \(g:[N]^q\rightarrow {\mathbb {R}}\) be symmetric functions vanishing on diagonals and set \(F:=Q(f;\varvec{Y})\) and \(G:=Q(g;\varvec{Y})\). Then, \(\kappa _4(F)\ge 0\), \(\kappa _4(G)\ge 0\) and
Before starting the proof, let us remark on how this result is related to the preceding studies. When \(f=g\), Azmoodeh et al. [1] have derived a better estimate than (5.4) in a more general setting. Their proof technique can also be applied to the case \(f\ne g\), and this has been implemented in Campese et al. [9]. However, this leads to a bound containing the quantity \({{\,\mathrm{Cov}\,}}[F^2,G^2]-2{{\,\mathrm{E}\,}}\left[ FG\right] ^2\), so an additional argument is needed to estimate it. For this reason, we take an alternative route for the proof, which is inspired by the discussions in Zheng [59] as well as [9, Proposition 3.6]. As a byproduct of this strategy, we obtain inequality (5.11), which leads to the universality of gamma variables.
We begin by introducing some notation. We write \(J_k\) for the orthogonal projection of \(L^2(P)\) onto the eigenspace \({{\,\mathrm{Ker}\,}}({{\,\mathrm{L}\,}}+k{{\,\mathrm{Id}\,}})\). For every i, we define the random variable \({\mathfrak {p}}_{2}(Y_i)\) by
The following lemma can be proved by a straightforward computation.
Lemma 5.2
For every i, \(\mathrm {E}[{\mathfrak {p}}_2(Y_i)]=\mathrm {E}[Y_i{\mathfrak {p}}_2(Y_i)]=0\) and \(\mathrm {E}[\mathfrak {p}_2(Y_i)^2]=v_i\).
Next, given \(h,h':[N]^r\rightarrow {\mathbb {R}}\), we define
Note that \(\Vert h\Vert _{\ell _2}^2=\langle h,h\rangle \). For every \(r\in \{0,1,\dots ,p\wedge q\}\), we define the function \(f\mathbin {\widehat{\star _{r}^0}}g:[N]^{p+q-r}\rightarrow {\mathbb {R}}\) by
Note that we have
where \(\widetilde{f\star _rg}\) is the symmetrization of \(f\star _rg\) (recall (2.3)). Finally, we set \(\Delta _q^N:=\{(i_1,\dots ,i_q)\in [N]^q:i_j\ne i_k\text { if }j\ne k\}\).
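Although the displays defining the contractions are not reproduced here, for \(p=q=2\) the standard contraction \(f\star _rg\) (summing over r identified arguments) reduces to familiar matrix operations, which the following toy sketch assumes: \(f\star _1g\) is the matrix product and \(f\star _2g\) is the entrywise inner product \(\langle f,g\rangle \).

```python
# Symmetric 3 x 3 kernels vanishing on the diagonal.
N = 3
f = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
g = [[0, 4, 5], [4, 0, 6], [5, 6, 0]]

# (f star_1 g)(i, j) = sum_a f(i, a) * g(a, j): one contracted index.
star1 = [[sum(f[i][a] * g[a][j] for a in range(N)) for j in range(N)]
         for i in range(N)]
# f star_2 g = sum_{i, j} f(i, j) * g(i, j): both indices contracted.
star2 = sum(f[i][j] * g[i][j] for i in range(N) for j in range(N))

assert star1[0][1] == 0 * 4 + 1 * 0 + 2 * 6   # = 12
assert star2 == 2 * (1 * 4 + 2 * 5 + 3 * 6)   # = 64 by symmetry
```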
The next lemma is a key part in our proof.
Lemma 5.3
Under the assumptions of Proposition 5.1, we have
Moreover, if \(p=q\), we have
Proof
We can deduce from [48, Proposition 2.9] and (5.5) that
Next, let \((i_1,\dots ,i_{p+q-r})\in \Delta ^N_{p+q-r}\) and \((j_1,\dots ,j_{p+q-l})\in \Delta ^N_{p+q-l}\). Thanks to Lemma 5.2,
does not vanish if and only if the following condition is satisfied:
-
(\(\star \)) \((i_1,\dots ,i_{p+q-2r})\) is a permutation of \((j_1,\dots ,j_{p+q-2l})\) and \((i_{p+q-2r+1},\dots ,i_{p+q-r})\) is a permutation of \((j_{p+q-2l+1},\dots ,j_{p+q-l})\).
Note that the condition (\(\star \)) can hold true only if \(r=l\). Moreover, if the condition (\(\star \)) is satisfied, we have
by Lemma 5.2. Since there are \((p+q-2r)!\) permutations of \((i_1,\dots ,i_{p+q-2r})\) and \(r!\) permutations of \((i_{p+q-2r+1},\dots ,i_{p+q-r})\), respectively, (5.8) yields
Now, (5.9) is especially true when all the \(Y_i\)’s follow the standard normal distribution. Therefore, the product formula for multiple integrals with respect to an isonormal Gaussian process yields
where \(f\mathbin {{\widetilde{\otimes }}}g\) is the symmetrization of \(f\otimes g\). Combining this formula with (5.9), we obtain (5.6).
Next, we prove (5.7). Similar arguments to the above yield
and
Combining these results with the Schwarz inequality, we infer that
The desired result now follows from Eq.(50) in [27] and Hölder’s inequality. \(\square \)
Lemma 5.4
Under the assumptions of Proposition 5.1, we have
and
Proof
The proof is parallel to that of [59, Lemma 3.1], but uses Lemma 5.3 instead of [59, Lemma 2.1]. \(\square \)
Proof of Proposition 5.2
The nonnegativity of \(\kappa _4(F)\) and \(\kappa _4(G)\) follows from (5.11). Inequality (5.4) can be shown in a manner similar to the proof of [59, Theorem 1.1], using Lemmas 5.3 and 5.4 instead of Lemmas 2.1 and 3.1 in [59], respectively. \(\square \)
5.3 Proof of Proposition 5.1
We have already established the nonnegativity of the \(\kappa _4(Q(f_j;\varvec{Y}))\)'s in Proposition 5.2. The remaining claim of the proposition follows once we prove (5.3). By Proposition 5.2 and Lemma A.2, it suffices to show that, under the assumptions of Proposition 5.1,
where \(C_{p,q}>0\) depends only on p, q. To prove (5.12), note that
Hence, using Lemma A.5, we can deduce (5.12) by a hypercontractivity argument similar to those in [33, Section 5] and [41, Section 3.2]. \(\square \)
6 Randomized Lindeberg Method
For any \(\varpi \ge 0\) and \(x\ge 0\), we set
The aim of this section is to prove the following result.
Proposition 6.1
Let \(\varvec{X}=(X_i)_{i=1}^N\) and \(\varvec{Y}=(Y_i)_{i=1}^N\) be two sequences of independent centered random variables with unit variance. Suppose that \(M_N:=\max _{1\le i\le N}(\Vert X_i\Vert _{\psi _\alpha }\vee \Vert Y_i\Vert _{\psi _\alpha })<\infty \) for some \(\alpha \in (0,2]\), and set \(\Lambda _i:=(\log d)^{(\overline{q}_d-1)/\alpha }\max _{1\le k\le d}M_N^{q_k-1}\sqrt{{{\,\mathrm{Inf}\,}}_{i}(f_k)}\) for \(i\in [N]\). Suppose also that there is an integer \(m\ge 3\) such that \(\mathrm {E}[X_i^r]=\mathrm {E}[Y_i^r]\) for all \(i\in [N]\) and \(r\in [m-1]\). Then, for any \(h\in C^m_b({\mathbb {R}})\), \(\beta >0\) and \(\tau ,\rho \ge 0\) with \(\tau \rho M_N\max _{1\le i\le N}\Lambda _i\le \beta ^{-1}\), we have
where \(C>0\) depends only on \(m,\alpha ,\overline{q}_d\), \(K_1\) depends only on \(\alpha \), and \(K_2,K_3>0\) depend only on \(\alpha ,\overline{q}_d\).
Remark 6.1
Proposition 6.1 can be viewed as a version of [47, Theorem 7.1]. Apart from the fact that we take higher-order moment matching into account, there are important differences between these two results. On the one hand, the latter takes all \(C^3\) functions with bounded third-order partial derivatives as test functions, while the former focuses only on test functions of the form \(x\mapsto h(\Phi _\beta (x-y))\) for some \(h\in C^m_b({\mathbb {R}})\) and \(y\in {\mathbb {R}}^d\). On the other hand, in the bound of (6.1), terms like \(\sum _{i=1}^N\max _{1\le k\le d}{{\,\mathrm{Inf}\,}}_i(f_k)\) always appear with exponential factors, so we can remove such terms by appropriately selecting the parameters \(\tau ,\rho \). In contrast, such a quantity appears (as the constant C) in the dominant term of the bound given by [47, Theorem 7.1]. As pointed out in Remark 2.4(b), this can be crucial in a high-dimensional setting, and this phenomenon originates from a (naïve) application of the Lindeberg method. To avoid this difficulty, we use a randomized version of the Lindeberg method, which was originally introduced in [24] for sums of independent random vectors.
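To illustrate the mechanism behind the (randomized) Lindeberg method, here is a minimal sketch for a toy quadratic homogeneous sum: the coordinates of \(\varvec{X}\) are replaced by those of \(\varvec{Y}\) one at a time along a uniformly random order, and the one-swap differences telescope to the total difference \(h(Q(f;\varvec{Y}))-h(Q(f;\varvec{X}))\). The kernel, the test function and the laws below are illustrative choices, not those of Proposition 6.1.

```python
import math
import random

random.seed(2)
N = 50
# Toy symmetric kernel vanishing on the diagonal.
f = [[0.0 if i == j else 1.0 / N for j in range(N)] for i in range(N)]

def Q(w):
    # Quadratic homogeneous sum Q(f; w) = sum_{i != j} f(i, j) w_i w_j.
    return sum(f[i][j] * w[i] * w[j] for i in range(N) for j in range(N))

h = math.tanh  # smooth bounded test function

X = [random.choice([-1.0, 1.0]) for _ in range(N)]  # e.g. Rademacher coordinates
Y = [random.gauss(0.0, 1.0) for _ in range(N)]      # Gaussian coordinates
order = list(range(N))
random.shuffle(order)  # the randomization: a uniform swapping order

W, prev, diffs = X[:], h(Q(X)), []
for i in order:
    W[i] = Y[i]        # replace one coordinate at a time
    cur = h(Q(W))
    diffs.append(cur - prev)
    prev = cur

# After all swaps W == Y, and the one-swap differences telescope.
assert W == Y
assert abs(sum(diffs) - (h(Q(Y)) - h(Q(X)))) < 1e-12
```

The proof then estimates each one-swap difference by a Taylor expansion in the swapped coordinate, and the random order is what removes the problematic maximal-influence terms from the dominant part of the bound.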
For the proof, we need three auxiliary results. The first one is a generalization of [36, Lemma S.5.1]:
Lemma 6.1
Let \(\xi \) be a nonnegative random variable such that \(P(\xi >x)\le Ae^{-(x/B)^\alpha }\) for all \(x\ge 0\) and some constants \(A,B,\alpha >0\). Then, we have
for any \(p>\alpha \) and \(t>0\).
Proof
The proof is analogous to that of [36, Lemma S.5.1] and elementary, so we omit it. \(\square \)
The second one is a moment inequality for homogeneous sums with a sharp constant:
Lemma 6.2
Let \(\varvec{X}=(X_i)_{i=1}^N\) be a sequence of independent centered random variables. Suppose that \(M:=\max _{1\le i\le N}\Vert X_i\Vert _{\psi _\alpha }<\infty \) for some \(\alpha \in (0,2]\). Also, let \(q\in {\mathbb {N}}\) and \(f:[N]^q\rightarrow {\mathbb {R}}\) be a symmetric function vanishing on diagonals. Then,
for any \(p\ge 2\), where \(K_{\alpha ,q}>0\) depends only on \(\alpha ,q\).
Since we need additional lemmas to prove Lemma 6.2, we postpone its proof to Appendix B.
The third one is well known and follows immediately from the commutativity of addition, but it deserves to be stated explicitly for later reference.
Lemma 6.3
Let S be a finite set and \(\varphi \) be a real-valued function on S. Also, let \(b:S\rightarrow S\) be a bijection. Then, \(\sum _{x\in A}\varphi (b(x))=\sum _{x\in b(A)}\varphi (x)\) for any \(A\subset S\).
Now we turn to the main body of the proof. Throughout the proof, we will use the standard multi-index notation. For a multi-index \(\lambda =(\lambda _1,\dots ,\lambda _d)\in {\mathbb {Z}}_+^d\), we set \(|\lambda |:=\lambda _1+\cdots +\lambda _d\), \(\lambda !:=\lambda _1!\cdots \lambda _d!\) and \(\partial ^\lambda :=\partial _1^{\lambda _1}\cdots \partial _d^{\lambda _d}\) as usual. Also, given a vector \(x=(x_1,\dots ,x_d)\in {\mathbb {R}}^d\), we write \(x^\lambda =x_1^{\lambda _1}\cdots x_d^{\lambda _d}\).
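These conventions are straightforward to spell out on an example:

```python
import math

# For lam = (2, 0, 1) in Z_+^3 and x = (1.5, -2.0, 3.0):
lam = (2, 0, 1)
x = (1.5, -2.0, 3.0)

assert sum(lam) == 3                                   # |lam| = lam_1 + lam_2 + lam_3
assert math.prod(math.factorial(k) for k in lam) == 2  # lam! = 2! * 0! * 1!
x_lam = math.prod(xi ** k for xi, k in zip(x, lam))    # x^lam = 1.5^2 * (-2)^0 * 3^1
assert abs(x_lam - 6.75) < 1e-12
```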
Proof of Proposition 6.1
Without loss of generality, we may assume \(\varvec{X}\) and \(\varvec{Y}\) are independent. Throughout the proof, for two real numbers a and b, the notation \(a\lesssim b\) means that \(a\le cb\) for some constant \(c>0\) which depends only on \(m,\alpha ,\overline{q}_d\).
Take a vector \(y\in {\mathbb {R}}^d\) and define the function \(\Psi :{\mathbb {R}}^d\rightarrow {\mathbb {R}}\) by \(\Psi (x)=h(\Phi _\beta (x-y))\) for \(x\in {\mathbb {R}}^d\). For any \(i\in [N]\), \(\sigma \in {\mathfrak {S}}_N\) and \(k\in [d]\), we define
and
Then, we set \(\varvec{U}^\sigma _{i}=(U_{k,i}^\sigma )_{k=1}^d\) and \(\varvec{V}^\sigma _{i}=(V^\sigma _{k,i})_{k=1}^d\). By construction, \(\varvec{U}^\sigma _{i}\) and \(\varvec{V}^\sigma _{i}\) are independent of \(X_{\sigma (i)}\) and \(Y_{\sigma (i)}\). Moreover, we have \(Q(f_k;\varvec{W}^\sigma _{i-1})=U^\sigma _{k,i}+Y_{\sigma (i)}V^\sigma _{k,i}\) and \(Q(f_k;\varvec{W}^\sigma _{i})=U^\sigma _{k,i}+X_{\sigma (i)}V^\sigma _{k,i}\) (with \(\varvec{W}^\sigma _0:=(Y_{\sigma (1)},\dots ,Y_{\sigma (N)})\)). In particular, by Lemma 6.3 it holds that \(Q(f_k;\varvec{W}^\sigma _{0})=Q(f_k;\varvec{Y})\) and \(Q(f_k;\varvec{W}^\sigma _{N})=Q(f_k;\varvec{X})\). Therefore, we obtain
Taylor’s theorem and the independence of \(X_{\sigma (i)}\) and \(Y_{\sigma (i)}\) from \(\varvec{U}_{i}^\sigma \) and \(\varvec{V}_{i}^\sigma \) yield
for \(\xi \in \{X_{\sigma (i)},Y_{\sigma (i)}\}\), where
Since \(\mathrm {E}[X_i^r]=\mathrm {E}[Y_i^r]\) for all \(i\in [N]\) and \(r\in [m-1]\) by assumption, we obtain
where \({\mathbf {I}}_i^\sigma :={\mathbf {I}}_i^\sigma [X_{\sigma (i)}]+{\mathbf {I}}_i^\sigma [Y_{\sigma (i)}]\), \(\mathbf {II}_i^\sigma :=\mathbf {II}_i^\sigma [X_{\sigma (i)}]+\mathbf {II}_i^\sigma [Y_{\sigma (i)}]\) and
for \(\xi \in \{X_{\sigma (i)},Y_{\sigma (i)}\}\) and \({\mathcal {E}}_{\sigma ,i}:=\{(|X_{\sigma (i)}|+|Y_{\sigma (i)}|)\Vert \varvec{V}^\sigma _i\Vert _{\ell _\infty }\le \tau \rho M_N\Lambda _{\sigma (i)}\}\).
First, we consider \({\mathbf {I}}_i^\sigma \). Since \(\tau \rho M_N\max _{1\le i\le N}\Lambda _i\le \beta ^{-1}\) by assumption, Lemma 3.1 and the independence of \(X_{\sigma (i)},Y_{\sigma (i)}\) from \(\varvec{U}^\sigma _i,\varvec{V}^\sigma _i\) imply that
where
and \({\mathcal {C}}_{\sigma ,i}:=\{|X_{\sigma (i)}|+|Y_{\sigma (i)}|\le \tau M_N\},{\mathcal {D}}_{\sigma ,i}:=\{\Vert \varvec{V}^\sigma _i\Vert _{\ell _\infty }\le \rho \Lambda _{\sigma (i)}\}\).
We begin by estimating \({\mathbf {I}}(1)_i^\sigma \). Let \((\delta _i)_{i=1}^N\) be a sequence of i.i.d. Bernoulli variables independent of \(\varvec{X}\) and \(\varvec{Y}\) with \(P(\delta _i=1)=1-P(\delta _i=0)=i/(N+1)\). We set \(\zeta _{i,a}:=\delta _i X_a+(1-\delta _i)Y_a\) for all \(i,a\in [N]\). Then, since \(\Vert \zeta _{i,\sigma (i)}\varvec{V}^\sigma _i\Vert _{\ell _\infty }\le \tau \rho M_N\max _{1\le i\le N}\Lambda _i\le \beta ^{-1}\) on the set \({\mathcal {C}}_{\sigma ,i}\cap {\mathcal {D}}_{\sigma ,i}\), by Lemma 3.1 we obtain
The subsequent discussions are inspired by the proof of [24, Lemma 2] and we introduce some notation analogous to theirs. For any \(i,a\in [N]\), we set
where \(\#S\) denotes the number of elements in a set S. We also set
for every \(i\in \{0,1\dots ,N\}\). Moreover, for any \(A,B\subset [N]\) with \(A\cap B=\emptyset \) and \(i\in A\cup B\), we define the random variable \(W^{(A,B)}_i\) by \(W^{(A,B)}_i:=X_i\) if \(i\in A\) and \(W^{(A,B)}_i:=Y_i\) if \(i\in B\). Then, we define
for any \(k\in [d]\) and \((A,B)\in \bigcup _{i=0}^\infty \mathcal {A}_i\), and set \(\varvec{Q}^{(A,B)}:=(Q_k^{(A,B)})_{k=1}^d\). We also define
for any \(k\in [d]\), \(i,a\in [N]\) and \((A,B)\in \bigcup _{j=1}^N{\mathcal {A}}_{j,a}\) and set \(\varvec{U}_a^{(A,B)}:=(U_{k,a}^{(A,B)})_{k=1}^d\) and \(\varvec{V}_a^{(A,B)}:=(V_{k,a}^{(A,B)})_{k=1}^d\). Finally, for any \(\sigma \in {\mathfrak {S}}_N\) and \(i\in [N]\) we set \(A^\sigma _i:=\{\sigma (1),\dots ,\sigma (i-1)\}\) and \(B^\sigma _i:=\{\sigma (i+1),\dots ,\sigma (N)\}\).
Now, since we have \(W^\sigma _{i,j}=W^{(A^\sigma _i,B^\sigma _i)}_{\sigma (j)}\) for \(j\in [N]\setminus \{i\}\), it holds that \(\varvec{U}^\sigma _{i}=\varvec{U}^{(A^\sigma _i,B^\sigma _i)}_{\sigma (i)}\) and \(\varvec{V}^\sigma _{i}=\varvec{V}^{(A^\sigma _i,B^\sigma _i)}_{\sigma (i)}\) by Lemma 6.3. Therefore, we obtain
Now, for \((A,B)\in {\mathcal {A}}_{i,a}\) we have \(\varvec{U}^{(A,B)}_{a}+X_a\varvec{V}^{(A,B)}_{a}=\varvec{Q}^{(A\cup \{a\},B)}\) and \(\varvec{U}^{(A,B)}_{a}+Y_a\varvec{V}^{(A,B)}_{a}=\varvec{Q}^{(A,B\cup \{a\})}\), so we obtain
Hence, Lemma 3.1 yields
Now, by Lemma 6.2 we have
for any \(r\ge 1\), where \(C_{\alpha ,\overline{q}_d}>0\) depends only on \(\alpha ,\overline{q}_d\). Hence, the Minkowski inequality yields
Thus, if \(\overline{q}_d>1\), Lemma A.5 yields
Therefore, by Lemmas A.2 and A.6 we conclude that
This inequality also holds true when \(\overline{q}_d=1\) because in this case the \(V^{(A\setminus \{a\},B\setminus \{a\})}_{k,a}\)'s are non-random, and thus it is a direct consequence of (6.6). As a result, we obtain
Next, we estimate \({\mathbf {I}}(2)^\sigma _i\). Since \(X_{\sigma (i)}\) and \(Y_{\sigma (i)}\) are independent of \(\varvec{U}^\sigma _i\) and \(\varvec{V}^\sigma _i\),
Hence, Lemma 3.1 yields
Now, if \(\overline{q}_d>1\), (6.5) and Lemma A.5 yield
where \(c_{\alpha ,\overline{q}_d}>0\) depends only on \(\alpha ,\overline{q}_d\). Hence, Lemmas A.2 and A.6 yield
for every \(r\ge 1\) with \(c'_{\alpha ,\overline{q}_d}>0\) depending only on \(\alpha ,\overline{q}_d\). This inequality also holds true when \(\overline{q}_d=1\) because in this case \(\varvec{V}^\sigma _{i}\) is non-random and thus it is a direct consequence of (6.5). Meanwhile, (A.3) and Lemma A.3 yield \( P({\mathcal {C}}_{\sigma ,i}^c)\le 2e^{-(\tau /2^{1\vee \alpha ^{-1}})^{\alpha }}. \) Consequently, we obtain
Third, we estimate \({\mathbf {I}}(3)_i^\sigma \). Lemma 3.1 yields
If \(\overline{q}_d>1\), (6.8) and Lemma A.4 yield
for every \(x>0\) with \(K_{\alpha ,\overline{q}_d}>0\) depending only on \(\alpha ,\overline{q}_d\). Hence, Lemma 6.1 yields
Meanwhile, if \(\overline{q}_d=1\), \(\varvec{V}^\sigma _{i}\) is non-random, so (6.8) yields \( {{\,\mathrm{E}\,}}\left[ \Vert \varvec{V}^\sigma _{i}\Vert _{\ell _\infty }^m;{\mathcal {D}}_{\sigma ,i}^c\right] \lesssim \Lambda _{\sigma (i)}^m1_{\{c'_{\alpha ,\overline{q}_d}>\rho \}}. \) Consequently, setting \(K_{\alpha ,\overline{q}_d}':=K_{\alpha ,\overline{q}_d}\vee c'_{\alpha ,\overline{q}_d}\), we obtain
Now, combining (6.4), (6.7), (6.9), (6.10) with Lemma A.6, we obtain
Next, we consider \(\mathbf {II}_i^\sigma \). Lemma 3.1 yields
Since \(X_{\sigma (i)}\) and \(Y_{\sigma (i)}\) are independent of \(\varvec{V}^\sigma _{i}\), Lemma A.6 and (6.8) imply that
for every \(r\ge 1\) with \(L_{\alpha ,\overline{q}_d}>0\) depending only on \(\alpha ,\overline{q}_d\). Thus, by Lemma A.4 we obtain
for every \(x>0\) with \(L'_{\alpha ,\overline{q}_d}>0\) depending only on \(\alpha ,\overline{q}_d\). So Lemma 6.1 yields
Combining this inequality with (6.2), (6.3) and (6.11), we complete the proof. \(\square \)
7 Proof of the Main Results
7.1 Proof of Theorem 2.1
The following result is a version of [47, Lemma 4.3]. Its proof is a minor modification of the proof of that result, so we omit it.
Lemma 7.1
Let \(q\in {\mathbb {N}}\) and \(f:[N]^q\rightarrow {\mathbb {R}}\) be a symmetric function vanishing on diagonals. Also, let \(\varvec{X}=(X_i)_{i=1}^N\) and \(\varvec{Y}=(Y_i)_{i=1}^N\) be two sequences of independent centered random variables with unit variance. Suppose that there are integers \(3\le m\le l\) such that \(M_N:=\max _{1\le i\le N}(\Vert X_i\Vert _{l}\vee \Vert Y_i\Vert _{l})<\infty \) and \(\mathrm {E}[X_i^r]=\mathrm {E}[Y_i^r]\) for all \(i\in [N]\) and \(r\in [m-1]\). Then, we have \(Q(f;\varvec{X}),Q(f;\varvec{Y})\in L^l(P)\) and
where \(C>0\) depends only on q, l.
Proof of Theorem 2.1
Throughout the proof, for two real numbers a and b, the notation \(a\lesssim b\) means that \(a\le cb\) for some constant \(c>0\) which depends only on \(\alpha ,\overline{q}_d\). Moreover, if \((\log d)^{\mu +\frac{1}{2}}\delta _1[\varvec{Q}(\varvec{X})]^{\frac{1}{3}}\ge 1\), then the claim evidently holds true with \(C=1\), so we may assume \((\log d)^{\mu +\frac{1}{2}}\delta _1[\varvec{Q}(\varvec{X})]^{\frac{1}{3}}<1\).
Set \(s_i:=\mathrm {E}[X_i^3]\) for every i. We take a sequence \(\varvec{Y}=(Y_i)_{i=1}^N\) of independent random variables such that
By construction, we have \(\mathrm {E}[X_i^r]=\mathrm {E}[Y_i^r]\) for any \(i\in [N]\) and \(r\in [3]\). Moreover, one can easily check that \(\Vert Y_i\Vert _r\le \overline{B}_N(r-1)^w\) for any \(i\in [N]\) and \(r\ge 2\). Hence, by Lemma A.5 we have \(\max _{1\le i\le N}\Vert Y_i\Vert _{\psi _\alpha }\le c_\alpha \overline{B}_N\) with \(c_\alpha \ge 1\) depending only on \(\alpha \). Therefore, applying Proposition 6.1 with \(m=4\), we obtain
for any \(\varepsilon >0\) and \(\tau ,\rho \ge 0\) with \(\tau \rho c_\alpha \overline{B}_N\max _{1\le i\le N}\Lambda _i\le \varepsilon /\log d\), where \(C_1,K_1,K_2,K_3>0\) depend only on \(\alpha ,\overline{q}_d\), and \(\Lambda _i:=(\log d)^{(\overline{q}_d-1)/\alpha }\max _{1\le k\le d}\overline{B}_N^{q_k-1}\sqrt{{{\,\mathrm{Inf}\,}}_{i}(f_k)}\). We apply this inequality with \(\tau :=(\log d^2)^{1/\alpha }\{K_1\vee (K_3/K_2)\}\), \(\rho :=(\log d^2)^{(\overline{q}_d-1)/\alpha }K_2\) and
By construction, we have
Therefore, we obtain
Since \(3+4\{(\overline{q}_d-1)/\alpha -\mu \}\le \frac{4}{3\alpha }(\overline{q}_d-1)+\frac{5}{3}\le 2\mu +1\) and \((\log d)^{\mu +\frac{1}{2}}\delta _1[\varvec{Q}(\varvec{X})]^{\frac{1}{3}}<1\), we conclude that
Meanwhile, Proposition 5.1 yields
Now, in the present situation, the constants \(w_*\), \(\overline{v}_N\) and \(\overline{\eta }_N\) appearing in Proposition 5.1 satisfy \(w_*=w\), \(\overline{v}_N\le 2+\overline{A}_N^2/2\) and \(\overline{\eta }_N^{-1}\le \overline{A}_N/2\), so we have
Moreover, by a standard hypercontractivity argument, we have \(\Vert Q(f_j;\varvec{Y})\Vert _4\lesssim \overline{A}_N^{q_j}\Vert Q(f_j;\varvec{Y})\Vert _2\) for every j. Also, since Lemma A.5 yields \(\max _{1\le i\le N}\Vert Y_i\Vert _{\psi _\alpha }\lesssim \underline{\eta }_N^{-1}\), by Lemma 7.1 (with \(l=m=4\)) we obtain
for every k. Since we have \(\Vert Q(f_k;\varvec{Y})\Vert _2=\sqrt{q_k!}\Vert f_k\Vert _{\ell _2}=\Vert Q(f_k;\varvec{X})\Vert _2\) for every k, it holds that \( \delta _2[\varvec{Q}(\varvec{Y})]\lesssim (\log d)^{2w\overline{q}_d-1}\delta _1[\varvec{Q}(\varvec{X})]. \) Consequently, we obtain
Since \(2(w\overline{q}_d-\mu )\le \frac{2}{3}w\overline{q}_d+\frac{1}{3}\le \mu +\frac{1}{2}\), we conclude that
Therefore, Proposition 3.1 yields
This completes the proof. \(\square \)
7.2 Proof of Corollaries 2.1 and 2.2
Corollary 2.1 can be shown in a manner analogous to the proof of [18, Corollary 5.1], applying Theorem 2.1 instead of [18, Lemma 5.1]. Corollary 2.2 immediately follows from Corollary 2.1. \(\square \)
7.3 Proof of Theorem 2.2
Lemma 7.2
Let \(q\ge 2\) and \(f:[N]^q\rightarrow {\mathbb {R}}\) be a symmetric function vanishing on diagonals. Suppose that the sequence \(\varvec{X}\) satisfies one of conditions (A)–(C). Then, we have \(\kappa _4(Q(f;\varvec{X}))\ge 0\) and
Proof
The first inequality in (7.3) is a consequence of Eq. (1.9) in [47] (note that their definition of \({{\,\mathrm{Inf}\,}}_i(f)\) equals ours divided by \((q-1)!\)). To prove the second inequality in (7.3), first suppose that \(\varvec{X}\) satisfies condition (A). Let \(\varvec{G}=(G_i)_{i\in {\mathbb {N}}}\) be a sequence of independent standard normal variables. Then, by [45, Proposition 3.1] we have \(\kappa _4(Q(f;\varvec{X}))\ge \kappa _4(Q(f;\varvec{G}))\), so (5.11) yields the desired result. Next, when \(\varvec{X}\) satisfies condition (B), the desired result follows from Eq. (5.3) in [29]. Finally, when \(\varvec{X}\) satisfies condition (C), the desired result follows from (5.11). This completes the proof. \(\square \)
Lemma 7.3
Let F, G be two random variables such that \(\Vert F\Vert _{\psi _\alpha }\vee \Vert G\Vert _{\psi _\alpha }<\infty \) for some \(\alpha >0\). Then, we have
for any \(r\ge 1\) and \(b\in (0,1)\), where \(\varvec{\Gamma }\) denotes the gamma function.
Proof
This is an easy consequence of [55, Theorem 8.16] and Lemma A.3. \(\square \)
Proof of Theorem 2.2
The inequality \(\kappa _4(Q(f;\varvec{X}))\ge 0\) is proved in Lemma 7.2. The implication (iv) \(\Rightarrow \) (iii) \(\Rightarrow \) (ii) is obvious. The implication (i) \(\Rightarrow \) (iv) follows from Corollary 2.2 and Lemma 7.2.
It remains to prove (ii) \(\Rightarrow \) (i). In view of Lemma 7.3, it is enough to prove \(\sup _{n\in {\mathbb {N}}}\max _{1\le j\le d_n}(\Vert Q(f_{n,j};\varvec{X})\Vert _{\psi _{\beta }}+\Vert Z_{n,j}\Vert _{\psi _{\beta }})<\infty \) for some \(\beta >0\). This follows from Lemmas 6.2 and A.5. \(\square \)
7.4 Proof of Lemma 2.1
Let us define the sequence of random variables \((Y_i)_{i=1}^N\) in the same way as in the proof of Theorem 2.1. Then, since \(\mathrm {E}[Y_i^4]=3+\frac{3}{2}|\mathrm {E}[X_i^3]|\le \frac{9}{2}M\), Lemma 7.1 yields
where \(C_1>0\) depends only on q. Now, since \(\mathrm {E}[Q(f;\varvec{X})^2]=q!\Vert f\Vert _{\ell _2}^2=\mathrm {E}[Q(f;\varvec{Y})^2]\) and \(\sqrt{\kappa _4(Q(f;\varvec{Y}))}\ge q\cdot q!\max _{1\le r\le q-1}\Vert f\star _rf\Vert _{\ell _2}\) by Lemma 7.2, we obtain the desired result. \(\square \)
7.5 Proof of Proposition 2.3
Lemma 7.4
Let \(\varvec{X}=(X_i)_{i=1}^N\) be a sequence of independent centered random variables with unit variance and such that \(M:=\max _{1\le i\le N}\mathrm {E}[X_i^4]<\infty \). Also, let \(f:[N]^2\rightarrow {\mathbb {R}}\) be a symmetric function vanishing on diagonals. Then, we have
where \(C>0\) is a universal constant.
Proof
By Proposition 3.1 and Eq. (3.1) in [21], we have
where
Since we have \(2\Vert f\Vert _{\ell _2}^4-G_{\text {V}}\le 8\Vert f\Vert _{\ell _2}^2\mathcal {M}(f)\), it holds that \(|\kappa _4(Q(f;\varvec{X}))|\le |\mathrm {E}[Q(f;\varvec{X})^4]-6G_{\text {V}}|+48\Vert f\Vert _{\ell _2}^2\mathcal {M}(f)\). Meanwhile, a straightforward computation yields
Hence, we obtain
where \(C_1>0\) is a universal constant. Since it holds that
we obtain the desired result. \(\square \)
Proof of Proposition 2.3
The desired result immediately follows from Corollary 2.2 and Lemma 7.4. \(\square \)
7.6 Proof of Proposition 2.5
Define the \(n_1\times n_2\) matrix \(\Xi _n(\theta )\) by \(\Xi _n(\theta )= (\frac{1}{2}\Delta _i^nX^1K^{ij}_\theta \Delta _j^nX^2)_{i,j},\) and set
Note that \(U_n^*(\theta )={\varvec{w}}^\top {\widetilde{\Xi }}_n(\theta ){\varvec{w}}\) with \({\varvec{w}}=((w^1_i)_{i=1}^{n_1},(w^2_j)_{j=1}^{n_2})^\top \). Hence, by Proposition 2.3, it suffices to prove
Claims (7.4) and (7.5) are established in the proof of [35, Proposition B.8] under the current assumptions. Moreover, arguing as in the bound for the quantity \(\mathrm {E}[R_{n,1}^*]\) in the proof of [35, Proposition B.8], we deduce that for any \(p\ge 1\)
Exchanging \(X^1\) and \(X^2\), we obtain a similar estimate. Hence, (7.6) holds by assumption. \(\square \)
Availability of data and materials
Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.
References
Azmoodeh, E., Campese, S., Poly, G.: Fourth Moment Theorems for Markov diffusion generators. J. Funct. Anal. 266, 2341–2359 (2014)
Azmoodeh, E., Malicet, D., Mijoule, G., Poly, G.: Generalization of the Nualart–Peccati criterion. Ann. Probab. 44, 924–954 (2016)
Azmoodeh, E., Peccati, G.: Malliavin–Stein method: a survey of recent developments. Working paper. arXiv:1809.01912 (2018)
Azmoodeh, E., Peccati, G., Poly, G.: The law of iterated logarithm for subordinated Gaussian sequences: uniform Wasserstein bounds. ALEA Lat. Am. J. Probab. Math. Stat. 13, 659–686 (2016)
Bakry, D., Gentil, I., Ledoux, M.: Analysis and Geometry of Markov Diffusion Operators. Springer, Berlin (2014)
Belloni, A., Chernozhukov, V., Chetverikov, D., Hansen, C., Kato, K.: High-dimensional econometrics and regularized GMM. Working paper. (2018) arXiv:1806.01888
Bentkus, V.: On the dependence of the Berry–Esseen bound on dimension. J. Statist. Plann. Inference 113, 385–402 (2003)
Bentkus, V., Götze, F., Paulauskas, V., Račkauskas, A.: The accuracy of Gaussian approximation in Banach spaces. In: Prokhorov, Y., Statulevicius, V. (eds.) Limit Theorems of Probability Theory, Chapter II, pp. 25–111. Springer, Berlin (2000)
Campese, S., Nourdin, I., Peccati, G., Poly, G.: Multivariate Gaussian approximations on Markov chaoses. Electron. Commun. Probab. 21, 1–9 (2016)
Chen, X.: Gaussian and bootstrap approximations for high-dimensional U-statistics and their applications. Ann. Stat. 46, 642–678 (2018)
Chen, X., Kato, K.: Randomized incomplete \(U\)-statistics in high dimensions. Ann. Stat. 47, 3127–3156 (2019)
Chen, X., Kato, K.: Jackknife multiplier bootstrap: finite sample approximations to the \(U\)-process supremum with applications. Probab. Theory Relat. Fields 176, 1097–1163 (2020)
Chernozhukov, V., Chetverikov, D., Kato, K.: Gaussian approximations and multiplier bootstrap for maxima of sums of high-dimensional random vectors. Ann. Stat. 41, 2786–2819 (2013)
Chernozhukov, V., Chetverikov, D., Kato, K.: Supplement to “Gaussian approximations and multiplier bootstrap for maxima of sums of high-dimensional random vectors”. https://doi.org/10.1214/13-AOS1161SUPP (2013)
Chernozhukov, V., Chetverikov, D., Kato, K.: Gaussian approximation of suprema of empirical processes. Ann. Stat. 42, 1564–1597 (2014)
Chernozhukov, V., Chetverikov, D., Kato, K.: Comparison and anti-concentration bounds for maxima of Gaussian random vectors. Probab. Theory Relat. Fields 162, 47–70 (2015)
Chernozhukov, V., Chetverikov, D., Kato, K.: Empirical and multiplier bootstraps for suprema of empirical processes of increasing complexity, and related Gaussian couplings. Stoch. Process. Appl. 126, 3632–3651 (2016)
Chernozhukov, V., Chetverikov, D., Kato, K.: Central limit theorems and bootstrap in high dimensions. Ann. Probab. 45, 2309–2353 (2017)
Chernozhukov, V., Chetverikov, D., Kato, K.: Detailed proof of Nazarov’s inequality. Unpublished paper. arXiv:1711.10696 (2017)
Courtade, T.A., Fathi, M., Pananjady, A.: Existence of Stein kernels under a spectral gap, and discrepancy bounds. Ann. Inst. Henri Poincaré Probab. Stat. 55, 777–790 (2019)
de Jong, P.: A central limit theorem for generalized quadratic forms. Probab. Theory Relat. Fields 75, 261–277 (1987)
de Jong, P.: A central limit theorem for generalized multilinear forms. J. Multivariate Anal. 34, 275–289 (1990)
de la Peña, V.H., Montgomery-Smith, S.J.: Decoupling inequalities for the tail probabilities of multivariate \(U\)-statistics. Ann. Probab. 23, 806–816 (1995)
Deng, H., Zhang, C.-H.: Beyond Gaussian approximation: Bootstrap for maxima of sums of independent random vectors. Ann. Stat. 48, 3643–3671 (2020)
Dette, H., Hetzler, B.: Specification tests indexed by bandwidths. Sankhyā Indian J. Stat. 69, 28–54 (2007)
Dirksen, S.: Tail bounds via generic chaining. Electron. J. Probab. 20, 1–29 (2015)
Döbler, C., Krokowski, K.: On the fourth moment condition for Rademacher chaos. Ann. Inst. Henri Poincaré Probab. Stat. 55, 61–97 (2019)
Döbler, C., Peccati, G.: Quantitative de Jong theorems in any dimension. Electron. J. Probab. 22, 1–35 (2017)
Döbler, C., Vidotto, A., Zheng, G.: Fourth moment theorems on the Poisson space in any dimension. Electron. J. Probab. 23, 1–27 (2018)
Götze, F., Tikhomirov, A.N.: Asymptotic distribution of quadratic forms. Ann. Probab. 27, 1072–1098 (1999)
Hitczenko, P., Montgomery-Smith, S., Oleszkiewicz, K.: Moment inequalities for sums of certain independent symmetric random variables. Studia Math. 123, 15–42 (1997)
Horowitz, J.L., Spokoiny, V.G.: An adaptive, rate-optimal test of a parametric mean-regression model against a nonparametric alternative. Econometrica 69, 599–631 (2001)
Janson, S.: Gaussian Hilbert Spaces. Cambridge University Press, Cambridge (1997)
Koike, Y.: Mixed-normal limit theorems for multiple Skorohod integrals in high-dimensions, with application to realized covariance. Electron. J. Stat. 13, 1443–1522 (2019)
Koike, Y.: Gaussian approximation of maxima of Wiener functionals and its application to high-frequency data. Ann. Stat. 47, 1663–1687 (2019)
Kuchibhotla, A.K., Chakrabortty, A.: Moving beyond sub-Gaussianity in high-dimensional statistics: applications in covariance estimation and linear regression. Working paper. arXiv:1804.02605 (2018)
Kwapień, S., Woyczyński, W.A.: Random Series and Stochastic Integrals: Single and Multiple. Birkhäuser, Basel (1992)
Ledoux, M.: Chaos of a Markov operator and the fourth moment condition. Ann. Probab. 40, 2439–2459 (2012)
Ledoux, M., Nourdin, I., Peccati, G.: Stein’s method, logarithmic Sobolev and transport inequalities. Geom. Funct. Anal. 25, 256–306 (2015)
Liu, M., Shang, Z., Cheng, G.: Nonparametric testing under random projection. Working paper. arXiv:1802.06308 (2018)
Mossel, E., O’Donnell, R., Oleszkiewicz, K.: Noise stability of functions with low influences: invariance and optimality. Ann. Math. 171, 295–341 (2010)
Nourdin, I., Peccati, G.: Stein’s method and exact Berry–Esseen asymptotics for functionals of Gaussian fields. Ann. Probab. 37, 2231–2261 (2009)
Nourdin, I., Peccati, G.: Stein’s method on Wiener chaos. Probab. Theory Relat. Fields 145, 75–118 (2009)
Nourdin, I., Peccati, G.: Normal Approximations with Malliavin Calculus: From Stein’s Method to Universality. Cambridge University Press, Cambridge (2012)
Nourdin, I., Peccati, G., Poly, G., Simone, R.: Classical and free fourth moment theorems: Universality and thresholds. J. Theoret. Probab. 29, 653–680 (2016)
Nourdin, I., Peccati, G., Poly, G., Simone, R.: Multidimensional limit theorems for homogeneous sums: a survey and a general transfer principle. ESAIM Probab. Stat. 20, 293–308 (2016)
Nourdin, I., Peccati, G., Reinert, G.: Invariance principles for homogeneous sums: Universality of Gaussian Wiener chaos. Ann. Probab. 38, 1947–1985 (2010)
Nourdin, I., Peccati, G., Reinert, G.: Stein’s method and stochastic analysis of Rademacher functionals. Electron. J. Probab. 15, 1703–1742 (2010)
Nualart, D., Peccati, G.: Central limit theorems for sequences of multiple stochastic integrals. Ann. Probab. 33, 177–193 (2005)
Paulauskas, V.: A note on the rate of convergence in the CLT for empirical processes. Lith. Math. J. 32, 312–316 (1992)
Peccati, G., Tudor, C.A.: Gaussian limits for vector-valued multiple stochastic integrals. In: Émery, M., Ledoux, M., Yor, M. (eds.) Séminaire de Probabilités XXXVIII, vol. 1857 of Lecture Notes in Math., pp. 247–262. Springer, Berlin (2005)
Peccati, G., Zheng, C.: Universal Gaussian fluctuations on the discrete Poisson chaos. Bernoulli 20, 697–715 (2014)
Rotar’, V.I.: Limit theorems for multilinear forms and quasipolynomial functions. Theory Probab. Appl. 20, 512–532 (1975)
Rotar’, V.I.: Limit theorems for polylinear forms. J. Multivariate Anal. 9, 511–530 (1979)
Rudin, W.: Real and Complex Analysis, 3rd edn. McGraw-Hill, New York (1987)
Song, Y., Chen, X., Kato, K.: Approximating high-dimensional infinite-order \(U\)-statistics: statistical and computational guarantees. Electron. J. Stat. 13, 4794–4848 (2019)
van der Vaart, A.W., Wellner, J.A.: Weak Convergence and Empirical Processes. Springer, Berlin (1996)
Zhai, A.: A high-dimensional CLT in \({\cal{W}}_2\) distance with near optimal convergence rate. Probab. Theory Relat. Fields 170, 821–845 (2018)
Zheng, G.: A Peccati–Tudor type theorem for Rademacher chaoses. ESAIM Probab. Stat. 23, 874–892 (2019)
Acknowledgements
The author is grateful to an anonymous referee for his or her constructive comments. Thanks are also due to the participants at the Osaka Probability Seminar on November 28, 2017, for insightful comments. The author also thanks Professor Giovanni Peccati for pointing out that the same type of bound as in Corollary 3.1 has already appeared in [4, Theorem 3.1].
Open Access
This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Funding
This study was funded by JST CREST (Grant Number JPMJCR14D7) and JSPS KAKENHI (Grant Numbers JP16K17105, JP17H01100, JP18H00836).
Ethics declarations
Conflict of interest
The author declares that he has no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix A: Properties of the \(\psi _\alpha \)-norm
Here we collect several properties of the \(\psi _\alpha \)-norm used in this paper. Let \(\alpha \) be a positive number. Recall that the \(\psi _\alpha \)-norm of a random variable X is defined by
where \(\psi _\alpha (x):=\exp (x^\alpha )-1\). From the definition, we can easily deduce the following useful identity:
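In the standard formulation (stated here for definiteness under the usual conventions; cf. the reference to the infimum in (A.1) below), the norm and the identity read

```latex
\Vert X\Vert _{\psi _\alpha }:=\inf \left\{ C>0:\mathrm {E}\left[ \psi _\alpha \left( |X|/C\right) \right] \le 1\right\} ,
\qquad
\Vert X\Vert _{\psi _\alpha }=\big \Vert \,|X|^\alpha \,\big \Vert _{\psi _1}^{1/\alpha },
```

the latter following immediately from \(\psi _1(x^\alpha )=\psi _\alpha (x)\).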
Using this relation, we can derive properties of the \(\psi _\alpha \)-norm from those of the \(\psi _1\)-norm. This is convenient because the latter are well studied in the literature. For example, since \(\Vert \cdot \Vert _{\psi _1}\) satisfies the triangle inequality, we have
for any random variables X, Y. Also, using Young’s inequality for products and Hölder’s inequality, one can prove \(\Vert X\Vert _{\psi _1}\le (\log 2)^{1/p-1}\Vert X\Vert _{\psi _p}\) for any random variable X and \(p>1\). So we obtain \( \Vert X\Vert _{\psi _\alpha }\le (\log 2)^{1/\beta -1/\alpha }\Vert X\Vert _{\psi _\beta } \) for any \(0<\alpha \le \beta <\infty \). Other useful results can be obtained from [57, Lemmas 2.2.1–2.2.2]:
Lemma A.1
Suppose that there are constants \(C,K>0\) such that \(P(|X|>x)\le Ke^{-Cx^\alpha }\) for all \(x>0\). Then, we have \(\Vert X\Vert _{\psi _\alpha }\le \left( (1+K)/C\right) ^{1/\alpha }\).
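As a quick numerical sanity check (an illustrative example of ours, not from the paper): for \(X\sim \mathrm{Exp}(1)\) we have \(P(|X|>x)=e^{-x}\), i.e. \(\alpha =1\), \(C=1\), \(K=1\), so Lemma A.1 gives \(\Vert X\Vert _{\psi _1}\le 2\). The defining condition \(\mathrm {E}[\psi _1(|X|/2)]\le 1\) can be verified by quadrature, since \(\mathrm {E}[e^{X/2}]=\int _0^\infty e^{-x/2}\,dx=2\).

```python
import math

# Sanity check of Lemma A.1 for X ~ Exp(1) (illustrative example):
# P(|X| > x) = e^{-x}, i.e. alpha = 1, C = 1, K = 1, so the lemma's
# bound is ((1 + K)/C)^{1/alpha} = 2.  We verify E[psi_1(|X|/2)] <= 1,
# where psi_1(x) = e^x - 1 and E[e^{X/2}] = int_0^inf e^{x/2} e^{-x} dx.

def mgf_at_half(upper=100.0, n=100_000):
    """Trapezoidal estimate of E[e^{X/2}] for X ~ Exp(1), integrand e^{-x/2}."""
    h = upper / n
    total = 0.5 * (1.0 + math.exp(-upper / 2))   # endpoint contributions
    total += sum(math.exp(-k * h / 2) for k in range(1, n))
    return total * h

bound = ((1 + 1) / 1) ** (1 / 1)   # Lemma A.1's bound: 2.0
lhs = mgf_at_half() - 1.0          # E[psi_1(|X|/bound)], approximately 1
print(bound, round(lhs, 4))        # prints: 2.0 1.0
```

Here the bound is in fact attained up to the factor 2, since \(\mathrm {E}[\psi _1(|X|/2)]=1\) exactly for this distribution.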
Lemma A.2
There is a universal constant \(K>0\) such that
for any \(\alpha >0\) and random variables \(X_1,\dots ,X_d\).
It is also easy to check that \(\Vert X\Vert _{\psi _\alpha }\) attains the infimum in (A.1) if \(\Vert X\Vert _{\psi _\alpha }<\infty \). That is, \(\mathrm {E}[\psi _\alpha (|X|/\Vert X\Vert _{\psi _\alpha })]\le 1\). Therefore, the Markov inequality yields the following converse of Lemma A.1:
Lemma A.3
If \(\Vert X\Vert _{\psi _\alpha }<\infty \), we have \(P(|X|\ge x)\le 2e^{-(x/\Vert X\Vert _{\psi _\alpha })^\alpha }\) for every \(x>0\).
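The Markov-inequality step behind Lemma A.3 can be made explicit (a standard computation; write \(t:=(x/\Vert X\Vert _{\psi _\alpha })^\alpha \)):

```latex
P(|X|\ge x)
\le \frac{\mathrm {E}\left[ \psi _\alpha \left( |X|/\Vert X\Vert _{\psi _\alpha }\right) \right] }{\psi _\alpha \left( x/\Vert X\Vert _{\psi _\alpha }\right) }
\le \frac{1}{e^{t}-1}\le 2e^{-t}\qquad \text {if }e^{t}\ge 2,
```

while for \(e^{t}<2\) the right-hand side \(2e^{-t}\) already exceeds 1, so the bound holds trivially.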
Next, we investigate the relation between the \(\psi _\alpha \)-norm and moment growth. First, [26, Lemma A.1] yields the following result:
Lemma A.4
If there is a constant \(A>0\) such that \(\Vert X\Vert _p\le Ap^{1/\alpha }\) for all \(p\ge 1\), then \( P(|X|\ge x)\le e^{1/\alpha }e^{-(\alpha e)^{-1}(x/A)^\alpha } \) for every \(x>0\).
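The underlying argument is the usual optimization of Markov's inequality over moments; a sketch under the lemma's assumption:

```latex
P(|X|\ge x)\le \left( \frac{\Vert X\Vert _p}{x}\right) ^p\le \left( \frac{Ap^{1/\alpha }}{x}\right) ^p\qquad (p\ge 1).
```

Choosing \(p=e^{-1}(x/A)^\alpha \) (admissible once \(x\ge Ae^{1/\alpha }\)) makes the base equal to \(e^{-1/\alpha }\) and yields the exponent \(-(\alpha e)^{-1}(x/A)^\alpha \); for \(x<Ae^{1/\alpha }\) the stated bound exceeds 1 thanks to the prefactor \(e^{1/\alpha }\), so it holds trivially.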
Combining Lemma A.4 with Lemma A.1, we obtain the following result:
Lemma A.5
Suppose that there is a constant \(A>0\) such that \(\Vert X\Vert _p\le Ap^{1/\alpha }\) for all \(p\ge 1\). Then, we have \(\Vert X\Vert _{\psi _\alpha }\le \left( (1+e^{1/\alpha })\alpha e\right) ^{1/\alpha }A\).
Lemma A.3 and [26, Lemma A.2] yield the following converse of Lemma A.5:
Lemma A.6
For all \(p\ge 1\), it holds that \(\Vert X\Vert _p\le c_\alpha \Vert X\Vert _{\psi _\alpha }p^{1/\alpha }\) with \( c_\alpha :=e^{1/2e-1/\alpha }\alpha ^{-1/\alpha }\max \left\{ 1,2\sqrt{\frac{2\pi }{\alpha }}e^{\alpha /12}\right\} . \)
Finally, we have the following Hölder-type inequality for the \(\psi _\alpha \)-norm:
Lemma A.7
([36], Proposition S.3.2) Let \(X_1,X_2\) be two random variables such that \(\Vert X_1\Vert _{\psi _{\alpha _1}}+\Vert X_2\Vert _{\psi _{\alpha _2}}<\infty \) for some \(\alpha _1,\alpha _2>0\). Then, we have \( \Vert X_1X_2\Vert _{\psi _\alpha }\le \Vert X_1\Vert _{\psi _{\alpha _1}}\Vert X_2\Vert _{\psi _{\alpha _2}}, \) where \(\alpha >0\) is defined by the equation \(1/\alpha =1/\alpha _1+1/\alpha _2\).
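As a concrete illustration of Lemma A.7 (our own example, using standard Gaussian closed forms rather than anything from the source): take \(X_1=X_2=|Z|\) with \(Z\sim N(0,1)\) and \(\alpha _1=\alpha _2=2\), so \(\alpha =1\). Then \(X_1X_2=Z^2\) and the inequality holds with equality, since \(\Vert Z\Vert _{\psi _2}^2=8/3=\Vert Z^2\Vert _{\psi _1}\). A numerical sketch solving the two defining equations by bisection:

```python
import math

# Lemma A.7 with X1 = X2 = |Z|, Z ~ N(0,1), alpha1 = alpha2 = 2, hence alpha = 1.
# For c^2 > 2, E[exp(Z^2/c^2)] = (1 - 2/c^2)^(-1/2); the psi_2-norm of Z is the c
# at which this equals 2 (i.e. E[psi_2(|Z|/c)] = 1).
def psi2_norm_gaussian():
    lo, hi = math.sqrt(2) + 1e-9, 10.0
    for _ in range(200):                      # bisection on the defining equation
        mid = (lo + hi) / 2
        if (1 - 2 / mid ** 2) ** -0.5 > 2:    # expectation too large -> c too small
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Likewise, E[exp(Z^2/C)] = (1 - 2/C)^(-1/2) gives the psi_1-norm of Z^2.
def psi1_norm_gaussian_square():
    lo, hi = 2 + 1e-9, 20.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if (1 - 2 / mid) ** -0.5 > 2:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

n2 = psi2_norm_gaussian()          # ~ sqrt(8/3)
n1 = psi1_norm_gaussian_square()   # ~ 8/3
print(n1, n2 * n2)                 # both ~ 8/3: equality in Lemma A.7 here
```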
Appendix B: Proof of Lemma 6.2
Lemma B.1
(Strong domination) Let \((\xi _i)_{i=1}^N\) and \((\theta _i)_{i=1}^N\) be two sequences of independent symmetric random variables. Suppose that there is an integer \(k>0\) such that \(P(|\xi _i|>t)\le k P(|\theta _i|>t)\) for all \(i\in [N]\) and \(t>0\). Then, for any \(p\ge 1\) and \(a_1,\dots ,a_N\in {\mathbb {R}}\) we have
Proof
This is a consequence of Theorem 3.2.1 and Corollary 3.2.1 in [37]. \(\square \)
Lemma B.2
Let \((\xi _i)_{i\in {\mathbb {N}}}\) be a sequence of independent copies of a symmetric random variable \(\xi \) satisfying \(P(|\xi |\ge t)=e^{-|t|^\alpha }\) for every \(t\ge 0\) and some \(0<\alpha \le 2\). Then, there is a constant \(C_\alpha >0\) which depends only on \(\alpha \) such that
for any \(p\ge 2\), \(N\in {\mathbb {N}}\) and \(a_1,\dots ,a_N\in {\mathbb {R}}\).
Proof
Note that, by Lemmas A.1 and A.6, there is a constant \(K_\alpha >0\) depending only on \(\alpha \) such that \(\Vert \xi \Vert _r\le K_\alpha r^{1/\alpha }\) for all \(r\ge 1\). Then, the result follows from [31, Theorem 1.1] when \(\alpha \le 1\) and from the Gluskin–Kwapień inequality (cf. page 17 of [31]) when \(\alpha >1\). \(\square \)
Lemma B.3
Let \((\zeta _i)_{i=1}^N\) be a sequence of independent centered random variables such that \(M:=\max _{1\le i\le N}\Vert \zeta _i\Vert _{\psi _\alpha }<\infty \) for some \(0<\alpha \le 2\). Then, we have
for any \(p\ge 1\) and \(a_1,\dots ,a_N\in {\mathbb {R}}\), where \(K_\alpha >0\) depends only on \(\alpha \).
Proof
Thanks to symmetrization inequalities (see, e.g., [57, Lemma 2.3.1]), it suffices to consider the case where \(\zeta _i\) is symmetric for all i. Then, the result follows from Lemmas A.3, B.1 and B.2. \(\square \)
Lemma B.4
Let \((X_{i,j})_{1\le i\le N,1\le j\le q}\) be an array of independent centered random variables. Assume \(M:=\max _{1\le i\le N,1\le j\le q}\Vert X_{i,j}\Vert _{\psi _\alpha }<\infty \) for some \(0<\alpha \le 2\). Then,
for any \(p\ge 2\) and \(f:[N]^q\rightarrow {\mathbb {R}}\), where \(K_{\alpha }>0\) is the same as in Lemma B.3.
Proof
This can easily be shown by induction on q and using Lemma B.3. \(\square \)
Proof of Lemma 6.2
The claim is an immediate consequence of [23, Theorem 1], [55, Theorem 8.16] and Lemma B.4. \(\square \)
Koike, Y. High-Dimensional Central Limit Theorems for Homogeneous Sums. J Theor Probab 36, 1–45 (2023). https://doi.org/10.1007/s10959-022-01156-2
Keywords
- de Jong’s theorem
- Fourth-moment theorem
- High dimensions
- Peccati–Tudor-type theorem
- Quantitative CLT
- Randomized Lindeberg method
- Stein kernel
- Universality