Abstract
Hwang’s quasi-power theorem asserts that a sequence of random variables whose moment generating functions are approximately given by powers of some analytic function is asymptotically normally distributed. This theorem is generalised to higher dimensional random variables. To obtain this result, a higher dimensional analogue of the Berry–Esseen inequality is proved, generalising a two-dimensional version by Sadikova.
1 Introduction
Asymptotic normality is a frequently occurring phenomenon, the classical central limit theorem being the very first example. The first step in the proof is the observation that the moment generating function of the sum of n independent, identically distributed random variables is the n-th power of the moment generating function of the distribution underlying the summands. As similar moment generating functions occur in many examples in combinatorics, a general theorem for proving asymptotic normality is desirable. Such a theorem was proved by Hwang [17] and is usually called the “quasi-power theorem”.
Theorem
(Hwang [17]) Let \(\{\Omega _n\}_{n\ge 1}\) be a sequence of integer-valued random variables. Suppose that the moment generating function satisfies the asymptotic expression
\[M_{n}(s)=\mathbb {E}\bigl (e^{\Omega _{n}s}\bigr )=e^{W_{n}(s)}\Bigl (1+O\Bigl (\frac {1}{\kappa _{n}}\Bigr )\Bigr ),\tag{1.1}\]
the O-term being uniform for \(|s|\le \tau \), \(s\in \mathbb {C}\), \(\tau >0\), where
(1) \(W_n(s)=u(s)\phi _{n}+v(s)\), with u(s) and v(s) analytic for \(|s|\le \tau \) and independent of n, and \(u''(0)\ne 0\);
(2) \(\lim _{n\rightarrow \infty }\phi _{n}=\infty \);
(3) \(\lim _{n\rightarrow \infty }\kappa _n=\infty \).
Then the distribution of \(\Omega _n\) is asymptotically normal, i.e.,
\[\sup _{x\in \mathbb {R}}\biggl |\mathbb {P}\biggl (\frac {\Omega _{n}-u'(0)\phi _{n}}{\sqrt {u''(0)\phi _{n}}}\le x\biggr )-\Phi (x)\biggr |=O\biggl (\frac {1}{\kappa _{n}}+\frac {1}{\sqrt {\phi _{n}}}\biggr ),\]
where \(\Phi \) denotes the standard normal distribution function
\[\Phi (x)=\frac {1}{\sqrt {2\pi }}\int _{-\infty }^{x}e^{-t^{2}/2}\,\mathrm {d}t.\]
See Hwang’s article [17] as well as Flajolet-Sedgewick [6, Sec. IX.5] for many applications of this theorem. A generalisation of the quasi-power theorem to dimension 2 has been provided in [12]. It has been used in [4, 14,15,16, 18]. In [3, Thm. 2.22], an m-dimensional version of the quasi-power theorem is stated without speed of convergence. Also in [1], such an m-dimensional theorem without speed of convergence is proved. There, several multidimensional applications are given, too.
In contrast to many results about the speed of convergence in classical probability theory (see, e.g., [11]), the sequence of random variables is not assumed to be independent. The only assumption is that the moment generating function behaves asymptotically like a large power. This mirrors the fact that the moment generating function of the sum of independent, identically distributed random variables is exactly a large power. The advantage is that the asymptotic expression (1.1) arises naturally in combinatorics by using techniques such as singularity analysis or saddle point approximation (see [6]).
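The exact-power phenomenon for independent summands can be checked directly. The following sketch (the fair \(\pm 1\) step is our own choice of illustration) verifies by brute-force enumeration that the moment generating function of a sum of n independent fair \(\pm 1\) steps is exactly the n-th power of the single-step moment generating function.

```python
from itertools import product
from math import exp, isclose

def mgf_sum_exact(n, s):
    # Brute-force MGF of S_n = X_1 + ... + X_n for i.i.d. fair +/-1 steps:
    # E(e^{s S_n}) computed by enumerating all 2^n outcomes.
    total = 0.0
    for signs in product((-1, 1), repeat=n):
        total += exp(s * sum(signs)) / 2 ** n
    return total

def mgf_single(s):
    # MGF of a single fair +/-1 step: (e^s + e^{-s}) / 2.
    return (exp(s) + exp(-s)) / 2

# The MGF of the sum is exactly the n-th power of the single-step MGF.
for n in (1, 3, 6):
    for s in (-0.7, 0.0, 0.5):
        assert isclose(mgf_sum_exact(n, s), mgf_single(s) ** n)
```

In combinatorial applications the power is only approximate, which is precisely the situation covered by the quasi-power theorem.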
The purpose of this article is to generalise the quasi-power theorem including the speed of convergence to arbitrary dimension m. We first state this main result in Theorem 1 in this section. In Sect. 2, a new Berry–Esseen inequality (Theorem 2) is presented, which we use to prove the m-dimensional quasi-power theorem. The combinatorial idea behind the formulation of the Berry–Esseen inequality is discussed in Sect. 3. Our Berry–Esseen bound is proved in Sect. 4. The final Sect. 5 is then devoted to the proof of the quasi-power theorem. Examples are given in the extended abstract [13].
We use the following conventions: vectors are denoted by boldface letters such as \(\mathbf {s}\), their components are then denoted by regular letters with indices such as \(s_j\). For a vector \(\mathbf {s}\), \(\Vert \mathbf {s}\Vert \) denotes the maximum norm \(\max \{|s_j |\}\). The standard inner product on \(\mathbb {C}^m\) (with linearity in the second argument) is denoted by \(\langle \,\cdot \,, \,\cdot \,\rangle \). All implicit constants of O-terms may depend on the dimension m as well as on \(\tau \) which is introduced in Theorem 1.
Our first main result is the following m-dimensional version of Hwang’s theorem.
Theorem 1
Let \(\{\varvec{\Omega }_n\}_{n\ge 1}\) be a sequence of m-dimensional real random vectors. Suppose that the moment generating function satisfies the asymptotic expression
\[M_{n}(\mathbf {s})=\mathbb {E}\bigl (e^{\langle \mathbf {s},\varvec{\Omega }_{n}\rangle }\bigr )=e^{W_{n}(\mathbf {s})}\Bigl (1+O\Bigl (\frac {1}{\kappa _{n}}\Bigr )\Bigr ),\tag{1.2}\]
the O-term being uniform for \(\Vert \mathbf {s}\Vert \le \tau \), \(\mathbf {s}\in \mathbb {C}^m\), \(\tau >0\), where
(1) \(W_n(\mathbf {s})=u(\mathbf {s})\phi _{n}+v(\mathbf {s})\), with \(u(\mathbf {s})\) and \(v(\mathbf {s})\) analytic for \(\Vert \mathbf {s}\Vert \le \tau \) and independent of n, and the Hessian \(H_u(\varvec{0})\) of u at the origin non-singular;
(2) \(\lim _{n\rightarrow \infty }\phi _{n}=\infty \);
(3) \(\lim _{n\rightarrow \infty }\kappa _n=\infty \).
Then, the distribution of \(\varvec{\Omega }_n\) is asymptotically normal with speed of convergence \(O(\phi _n^{-1/2})\), i.e.,
\[\sup _{\mathbf {x}\in \mathbb {R}^m}\biggl |\mathbb {P}\biggl (\frac {\varvec{\Omega }_{n}-\phi _{n}{{\mathrm{grad}}}\,u(\varvec{0})}{\sqrt {\phi _{n}}}\le \mathbf {x}\biggr )-\Phi _{\Sigma }(\mathbf {x})\biggr |=O\bigl (\phi _{n}^{-1/2}\bigr ),\tag{1.3}\]
where \(\Phi _{\Sigma }\) denotes the distribution function of the non-degenerate m-dimensional normal distribution with mean \(\varvec{0}\) and variance-covariance matrix \(\Sigma =H_{u}(\varvec{0})\), i.e.,
\[\Phi _{\Sigma }(\mathbf {x})=\frac {1}{\sqrt {(2\pi )^{m}\det \Sigma }}\int _{\mathbf {y}\le \mathbf {x}}e^{-\frac {1}{2}\mathbf {y}^{\top }\Sigma ^{-1}\mathbf {y}}\,\mathrm {d}\mathbf {y},\]
where \(\mathbf {y}\le \mathbf {x}\) means \(y_\ell \le x_\ell \) for \(1\le \ell \le m\).
If \(H_{u}(\varvec{0})\) is singular, the random variables
\[\frac {\varvec{\Omega }_{n}-\phi _{n}{{\mathrm{grad}}}\,u(\varvec{0})}{\sqrt {\phi _{n}}}\]
converge in distribution to a degenerate normal distribution with mean \(\varvec{0}\) and variance-covariance matrix \(H_{u}(\varvec{0})\).
Note that in the case of singular \(H_{u}(\varvec{0})\), a uniform speed of convergence cannot be guaranteed. To see this, consider the (constant) sequence of random variables \(\Omega _{n}\) taking the values \(\pm 1\), each with probability 1/2. Then the moment generating function is \((e^{s}+e^{-s})/2\), which is of the form (1.2) with \(\phi _{n}=n\), \(u(s)=0\), \(v(s)=\log ((e^{s}+e^{-s})/2)\) and \(\kappa _{n}\) arbitrary. However, the distribution function of \(\Omega _{n}/\sqrt{n}\) is the step function
\[\mathbb {P}\Bigl (\frac {\Omega _{n}}{\sqrt {n}}\le x\Bigr )=\frac {1}{2}\,[x\ge -1/\sqrt {n}\,]+\frac {1}{2}\,[x\ge 1/\sqrt {n}\,],\]
which does not converge uniformly: at \(x=0\), it has the value 1/2, whereas the limiting degenerate distribution function \([x\ge 0]\) has the value 1.
In contrast to the original quasi-power theorem, the error term in our result does not contain the summand \(O(1/\kappa _n)\). In fact, this summand could also be omitted in the original proof of the quasi-power theorem by using a better estimate for the error \(E_n(s)=M_n(s)e^{-W_{n}(s)}-1\), cf. the proof of our Lemma 5.1.
The order of the error is optimal (without further assumptions on the random variables), as is the case for the one-dimensional Berry–Esseen inequality. See, for example, the approximation of the binomial distribution by the normal distribution [20, § 1.2].
The proof of Theorem 1 relies on an m-dimensional Berry–Esseen inequality (Theorem 2). It is a generalisation of Sadikova’s result [23, 24] in dimension 2. The main challenge is to provide a version which leads to bounded integrands around the origin, but still allows the use of excellent bounds for the tails of the characteristic functions. To achieve this, linear combinations involving all partitions of the set \(\{1,\ldots , m\}\) are used.
Note that there are several generalisations of the one-dimensional Berry–Esseen inequality [2, 5] to arbitrary dimension, see, e.g., Gamkrelidze [7, 8] and Prakasa Rao [21]. However, using these results would lead to a less precise error term in (1.3), see the end of Sect. 2 for more details. For that reason we generalise Sadikova’s result, which was already successfully used by the first author in [12] to prove a 2-dimensional quasi-power theorem. Also note that our theorem can deal with discrete random variables, too, in contrast to [22], where density functions are considered.
For the sake of completeness, we also state the following result about the moments of \(\varvec{\Omega }_{n}\).
Proposition 1.1
The cross-moments of \(\varvec{\Omega }_{n}\) satisfy
for \(k_{\ell }\) nonnegative integers, where \(p_{\mathbf {k}}\) is a polynomial of degree \(\sum _{\ell =1}^{m}k_{\ell }\) defined by
In particular, the mean and the variance-covariance matrix are
respectively.
2 A Berry–Esseen inequality
This section is devoted to a generalisation of Sadikova’s Berry–Esseen inequality [23, 24] in dimension 2 to dimension m. Before stating the theorem, we introduce our notation.
We use Iverson’s convention
popularized by Graham, Knuth, and Patashnik [10]. As in [10], we consider \([ expr ]\) to be “very strongly zero” when \( expr \) is false, so that even products of \([ expr ]\) with an undefined quantity vanish in that case.
Let \(L=\{1,\ldots , m\}\). For \(K\subseteq L\), we write \(\mathbf {s}_K=(s_k)_{k\in K}\) for the projection of \(\mathbf {s}\in \mathbb {C}^L\) to \(\mathbb {C}^K\). For \(J\subseteq K\subseteq L\), let \(\chi _{J,K}:\mathbb {C}^{J}\rightarrow \mathbb {C}^{K}\), \((s_{j})_{j\in J}\mapsto (s_{k}[k\in J ])_{k\in K}\) be an injection from \(\mathbb {C}^{J}\) into \(\mathbb {C}^{K}\). Similarly, let \(\psi _{J,K}:\mathbb {C}^{K}\rightarrow \mathbb {C}^{K}\), \((s_{k})_{k\in K}\mapsto (s_{k}[k\in J ])_{k\in K}\) be the projection which sets all coordinates corresponding to \(K\setminus J\) to 0.
We denote the set of all partitions of K by \(\Pi _K\). We consider a partition as a set \(\alpha =\{J_{1},\ldots ,J_{k}\}\). Thus \(|\alpha |\) denotes the number of parts of the partition \(\alpha \). Furthermore, \(J\in \alpha \) means that J is a part of the partition \(\alpha \).
Now, we can define an operator which we later use to state our Berry–Esseen inequality. The motivation behind this definition is explained at the end of this section.
Definition 2.1
Let \(K\subseteq L\) and \(h:\mathbb {C}^K\rightarrow \mathbb {C}\). We define the non-linear operator
\[\Lambda _{K}(h)=\sum _{\alpha \in \Pi _K}\mu _{\alpha }\prod _{J\in \alpha }h\circ \psi _{J,K},\]
where
\[\mu _{\alpha }=(-1)^{|\alpha |-1}(|\alpha |-1)!\,.\]
We denote \(\Lambda _{L}\) briefly by \(\Lambda \).
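As a sanity check, the operator of Definition 2.1 can be sketched in code; the coefficients \(\mu _\alpha =(-1)^{|\alpha |-1}(|\alpha |-1)!\) are the Möbius values of the partition lattice discussed in Sect. 3, and the test function h below is an arbitrary choice with \(h(\varvec{0})=1\).

```python
from math import exp, factorial, isclose

def partitions(elems):
    """Yield all set partitions of the tuple elems as lists of tuples."""
    if not elems:
        yield []
        return
    first, rest = elems[0], elems[1:]
    for beta in partitions(rest):
        # put `first` into an existing part ...
        for i in range(len(beta)):
            yield beta[:i] + [(first,) + beta[i]] + beta[i + 1:]
        # ... or into a new singleton part
        yield beta + [(first,)]

def Lambda(h, s):
    """Sketch of Definition 2.1: sum over partitions alpha of {0,...,m-1}
    of mu_alpha * prod over parts J of h(psi_J(s)), where psi_J zeroes all
    coordinates outside J and mu_alpha = (-1)^(|alpha|-1) (|alpha|-1)!."""
    m = len(s)
    result = 0.0
    for alpha in partitions(tuple(range(m))):
        mu = (-1) ** (len(alpha) - 1) * factorial(len(alpha) - 1)
        prod = 1.0
        for J in alpha:
            psi = tuple(s[k] if k in J else 0.0 for k in range(m))
            prod *= h(psi)
        result += mu * prod
    return result

# In dimension 2 this reduces to Sadikova's operator
#   Lambda(h)(s1, s2) = h(s1, s2) - h(s1, 0) * h(0, s2).
h = lambda s: exp(sum(s)) + 0.25 * s[0] * s[-1]   # some test function, h(0) = 1
s = (0.3, -0.8)
assert isclose(Lambda(h, s), h(s) - h((s[0], 0.0)) * h((0.0, s[1])))
```

The vanishing of \(\Lambda (h)\) whenever a coordinate is zero (Lemma 3.1) can be observed numerically with this sketch as well.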
For any random variable \(\mathbf {Z}\), we denote its cumulative distribution function by \(F_\mathbf {Z}\), its density function by \(f_\mathbf {Z}\) (if it exists) and its characteristic function by \(\varphi _\mathbf {Z}\).
With these definitions, we are able to state our second main result, an m-dimensional version of the Berry–Esseen inequality.
Theorem 2
Let \(m\ge 1\) and \(\mathbf {X}\) and \(\mathbf {Y}\) be m-dimensional random variables. Assume that \(F_\mathbf {Y}\) is differentiable.
Let
for \(1\le j\le m\), where \(\genfrac \{\}{0pt}{}{n}{k}\) denotes a Stirling partition number (a Stirling number of the second kind).
Then for every \(T>0\),
holds. Existence of \(\mathbb {E}(\mathbf {X})\) and \(\mathbb {E}(\mathbf {Y})\) is sufficient for the finiteness of the integral in (2.1).
Let us give two remarks on the distribution functions occurring in this theorem: The distribution function \(F_\mathbf {Y}\) is non-decreasing in every variable, thus \(A_j>0\) for all j. Furthermore, our general notations imply that \(F_{\mathbf {X}_J}\) is a marginal distribution of \(\mathbf {X}\).
The numbers \(B_j\) are known as “Fubini numbers” or “ordered Bell numbers”. They form the sequence A000670 in [19].
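The Fubini numbers can be computed from Stirling partition numbers via the classical identity \(B_{j}=\sum _{k}\genfrac \{\}{0pt}{}{j}{k}\,k!\); a short sketch (function names are ours), checked against the first terms of A000670:

```python
from functools import lru_cache
from math import factorial

@lru_cache(maxsize=None)
def stirling2(n, k):
    # Stirling partition number: number of partitions of an n-set into k parts.
    if n == 0 and k == 0:
        return 1
    if n == 0 or k == 0:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

def fubini(n):
    # Ordered Bell (Fubini) number: number of ordered set partitions of an n-set.
    return sum(factorial(k) * stirling2(n, k) for k in range(n + 1))

# First terms of OEIS A000670:
assert [fubini(n) for n in range(6)] == [1, 1, 3, 13, 75, 541]
```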
Recursive application of (2.1) leads to the following corollary, where we no longer explicitly state the constants depending on the dimension.
Corollary 2.2
Let \(m\ge 1\) and \(\mathbf {X}\) and \(\mathbf {Y}\) be m-dimensional random variables. Assume that \(F_\mathbf {Y}\) is differentiable and let
Then
where the O-constants only depend on the dimension m.
Existence of \(\mathbb {E}(\mathbf {X})\) and \(\mathbb {E}(\mathbf {Y})\) is sufficient for the finiteness of the integrals in (2.2).
In order to explain the choice of the operator \(\Lambda \), we first state it in dimension 2:
\[\Lambda (h)(s_{1},s_{2})=h(s_{1},s_{2})-h(s_{1},0)\,h(0,s_{2}).\tag{2.3}\]
This coincides with Sadikova’s definition. It also shows that our operator is non-linear as, e.g., \(\Lambda (s_{1}+s_{2})(s_{1},s_{2})\ne \Lambda (s_{1})(s_{1},s_{2})+\Lambda (s_{2})(s_{1},s_{2})\).
In Theorem 2, we apply \(\Lambda \) to characteristic functions; so we may restrict our attention to functions h with \(h(\varvec{0})=1\). From (2.3), we see that \(\Lambda (h)(s_1, 0) = \Lambda (h)(0, s_2)=0\), so that \(\Lambda (h)(s_1, s_2)/(s_1s_2)\) is bounded around the origin. This is essential for the boundedness of the integral in Theorem 2. In general, this property will be guaranteed by our particular choice of coefficients. The coefficients \(\mu _\alpha \) for \(\alpha \in \Pi _L\) are chosen as the values \(\mu (\alpha , \{L\})\) of the Möbius function of the lattice of partitions, which reflects the underlying combinatorial structure (as will be explained): Weisner’s theorem (see Stanley [25, Corollary 3.9.3]) is crucial in the proof that \(\Lambda (h)(\mathbf {s})/(s_1\ldots s_m)\) is bounded around the origin (see the proof of Lemma 3.1).
The second property is that our proof of the quasi-power theorem needs estimates for the tails of the integral in Theorem 2. These estimates have to be exponentially small in every variable, which means that every variable has to occur in every summand. This is trivially fulfilled in our definition, as every summand in the definition of \(\Lambda \) is formulated in terms of a partition. In contrast, Gamkrelidze [8] (and also Prakasa Rao [21]) uses a linear operator L mapping h to
\[L(h)(s_{1},s_{2})=h(s_{1},s_{2})-h(s_{1},0)-h(0,s_{2}).\tag{2.4}\]
When taking the difference of two characteristic functions, we may assume that \(h(0, 0)=0\) so that the first crucial property as defined above still holds. However, the tails are no longer exponentially small in every variable: the last summand \(h(0,s_{2})\) in (2.4) is not exponentially small in \(s_{1}\) because it is independent of \(s_{1}\) and nonzero in general. However, the first two summands are exponentially small in \(s_{1}\) by our assumption (1.2).
For that reason, using the Berry–Esseen inequality by Gamkrelidze [8] to prove a quasi-power theorem leads to a weaker error term \(O(\phi _{n}^{-1/2}\log ^{m-1}\phi _n)\) in (1.3). It can be shown that the less precise error term necessarily appears when using Gamkrelidze’s result by considering the example of \(\varvec{\Omega }_n\) being the 2-dimensional vector consisting of a normal distribution with mean \(-1\) and variance n and a normal distribution with mean 0 and variance n. This is a consequence of the linearity of the operator L in Gamkrelidze’s result.
3 Combinatorial background of the operator \(\Lambda \)
Before we start with the proof of Theorem 2, we state and prove the property of the operator \(\Lambda \) which motivates its Definition 2.1.
Lemma 3.1
Let \(K\subsetneq L\) and \(h:\mathbb {C}^L\rightarrow \mathbb {C}\) with \(h(\varvec{0})=1\). Then
\[\Lambda (h)\circ \psi _{K,L}=0.\]
Before actually proving the lemma, we recall some of the theory about the Möbius function of a partially ordered set (poset), see also Stanley [25, Section 3.7].
By the following definition, \(\Pi _L\), the set of all partitions of L, is a poset: As usual, a partition \(\alpha \in \Pi _L\) is said to be a refinement of a partition \(\alpha '\in \Pi _L\) if
In this case, we write \(\alpha \le \alpha '\). This defines a partial order on \(\Pi _L\).
The Möbius function on \(\Pi _L\) is denoted by \(\mu \) and is recursively defined as follows: we set \(\mu (\alpha ', \alpha ')=1\) and, for \(\alpha < \alpha '\),
\[\mu (\alpha ,\alpha ')=-\sum _{\alpha <\gamma \le \alpha '}\mu (\gamma ,\alpha ').\]
For \(\alpha \), \(\alpha '\in \Pi _L\), the infimum \(\alpha \wedge \alpha '\) of \(\alpha \) and \(\alpha '\) is given by
\[\alpha \wedge \alpha '=\{J\cap J':J\in \alpha ,\ J'\in \alpha ',\ J\cap J'\ne \emptyset \}.\]
In fact, \(\Pi _L\) is a lattice (cf. Stanley [25, Example 3.10.4]). The greatest element is \(\{L\}\).
For \(\alpha \in \Pi _L\), we have
\[\mu (\alpha ,\{L\})=(-1)^{|\alpha |-1}(|\alpha |-1)!\,,\]
where \(|\alpha |\) denotes the number of parts of the partition, see Stanley [25, (3.37)]. In particular, we may rewrite the definition of \(\Lambda \) (Definition 2.1) as
\[\Lambda (h)=\sum _{\alpha \in \Pi _L}\mu (\alpha ,\{L\})\prod _{J\in \alpha }h\circ \psi _{J,L}.\tag{3.1}\]
For any \(\gamma \), \(\beta \in \Pi _L\) with \(\gamma \le \beta < \{L\}\), Weisner’s theorem (see Stanley [25, Corollary 3.9.3]) applied to the interval \([\gamma , \{L\}]\) asserts that
\[\sum _{\begin{subarray}{c}\alpha \in \Pi _L\\ \alpha \wedge \beta =\gamma \end{subarray}}\mu (\alpha ,\{L\})=0.\tag{3.2}\]
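The closed form \(\mu (\alpha ,\{L\})=(-1)^{|\alpha |-1}(|\alpha |-1)!\) can be checked against the recursive definition of the Möbius function by brute force; a sketch for \(m=4\), with partitions represented as frozensets of frozensets:

```python
from math import factorial

def partitions(elems):
    """Yield all set partitions of the tuple elems as lists of tuples."""
    if not elems:
        yield []
        return
    first, rest = elems[0], elems[1:]
    for beta in partitions(rest):
        for i in range(len(beta)):
            yield beta[:i] + [(first,) + beta[i]] + beta[i + 1:]
        yield beta + [(first,)]

m = 4
L = tuple(range(m))
Pi_L = [frozenset(frozenset(J) for J in alpha) for alpha in partitions(L)]
top = frozenset({frozenset(L)})

def refines(alpha, gamma):
    # alpha <= gamma: every part of alpha is contained in some part of gamma.
    return all(any(J <= Jp for Jp in gamma) for J in alpha)

memo = {}
def mu(alpha):
    # Recursion: mu(top, top) = 1, and for alpha < top,
    # mu(alpha, top) = -sum of mu(gamma, top) over alpha < gamma <= top.
    if alpha == top:
        return 1
    if alpha not in memo:
        memo[alpha] = -sum(mu(g) for g in Pi_L
                           if g != alpha and refines(alpha, g))
    return memo[alpha]

# Closed form mu(alpha, {L}) = (-1)^(|alpha|-1) (|alpha|-1)!  (Stanley (3.37)):
for alpha in Pi_L:
    assert mu(alpha) == (-1) ** (len(alpha) - 1) * factorial(len(alpha) - 1)
# The defining recursion applied to the bottom element forces the total sum to 0.
assert sum(mu(alpha) for alpha in Pi_L) == 0
```

The final assertion is the special case of the vanishing sums that drive the proof of Lemma 3.1.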
We now turn to the actual proof of the lemma.
Proof of Lemma 3.1
Consider the partition \(\beta =\{K\} \cup \{\{k\}:k\in L\setminus K\}\) of L, i.e., \(\beta \) consists of K as one part and a collection of singletons. As \(K\ne L\), we have \(\beta < \{L\}\).
By definition of \(\psi \), we have \(\psi _{J,L}\circ \psi _{K,L}=\psi _{J\cap K, L}\) for J, \(K\subseteq L\). If \(\alpha \in \Pi _L\), then
because parts \(J\in \alpha \) with \(J\cap K=\emptyset \) contribute \(h(\varvec{0})=1\). Therefore, collecting the sum (3.1) according to \(\alpha \wedge \beta \) yields
As \(\gamma \le \beta <\{L\}\), the inner sum vanishes by (3.2). \(\square \)
4 Proof of the Berry–Esseen inequality
This section is devoted to the proof of the Berry–Esseen inequality, Theorem 2. It is a generalisation of Sadikova’s proof.
We start with an auxiliary one-dimensional random variable.
Lemma 4.1
Let P be the one-dimensional random variable with probability density function
Then its characteristic function is
and
Let \(\lambda \) be the unique positive number such that
Then
Proof
The characteristic function (4.1) is mentioned in [9, Section 39]; it is computed by standard methods.
Differentiating \(\varphi _P\) twice, we see that the second moment is 12. To prove (4.2), we rewrite \(\mathbb {E}(|P |)\) as
We use the estimates \(\sin z\le z\) and \(|\sin z |\le 1\) on the intervals [0, 1] and \([1, \infty )\), respectively. Thus
To obtain a bound for \(\lambda \), we follow Gamkrelidze [8]: we estimate the tail using \(|\sin ^4(z) |\le 1\) and get
This results in (4.3). \(\square \)
In the next step, we consider tuples of random variables distributed as P. They will be used to ensure smoothness. We write \(\varvec{1}\) to denote a vector with all coordinates equal to 1.
Lemma 4.2
Let \(\mathbf {Q}=(P_1/T, \ldots , P_m/T)\) be the m-dimensional random variable where the \(P_j\) are independent random variables with the same distribution as P in Lemma 4.1 and T is the fixed constant defined in Theorem 2.
Then \(\mathbf {Q}\) has density function and characteristic function
respectively. The characteristic function vanishes outside \([-T, T]^m\).
Furthermore,
hold for \(\theta \in \{\pm 1\}\) and \(j\in \{1,\ldots , m\}\).
Proof
Because of independence, the distribution function and the characteristic function of \(\mathbf {Q}\) are the product of the distribution functions and the characteristic functions of the \(P_j/T\), respectively. Division by T transforms the density and characteristic functions as claimed. As \(\varphi _P(t)\) vanishes outside \([-1, 1]\) by (4.1), \(\varphi _\mathbf {Q}(\mathbf {t})\) vanishes outside \([-T, T]^m\).
By a simple translation, the integral on the left hand side of (4.4) can be seen to be equal to
Then (4.4) is a simple consequence of \(Q_j=P_j/T\), (4.2) and the triangle inequality.
By the same translation and the definition of \(\lambda \), the integral on the left hand side of (4.5) is
\(\square \)
From now on, we let \(\mathbf {Q}\) be as in Lemma 4.2 and let \(\mathbf {Q}\) be independent of \(\mathbf {X}\) and \(\mathbf {Y}\). We first prove an inequality relating the difference between the distribution functions of \(\mathbf {X}\) and \(\mathbf {Y}\) to that of the distribution functions of \(\mathbf {X}+\mathbf {Q}\) and \(\mathbf {Y}+\mathbf {Q}\).
Lemma 4.3
We have
Proof
Let
and \(\varepsilon >0\). We choose \(\theta \in \{\pm 1\}\) such that \(S=\sup _{z\in \mathbb {R}^m}\theta (F_{\mathbf {X}}(\mathbf {z})-F_{\mathbf {Y}}(\mathbf {z}))\).
There is a \(\mathbf {z}_{\varepsilon }\in \mathbb {R}^m\) such that
Let \(\mathbf {w}\in \mathbb {R}^m\) with \(\theta \mathbf {w}\le \varvec{0}\). By monotonicity of \(F_{\mathbf {X}}\), we have \(\theta F_{\mathbf {X}}(\mathbf {z}_\varepsilon -\mathbf {w})\ge \theta F_{\mathbf {X}}(\mathbf {z}_\varepsilon )\). Thus
We multiply this inequality by \(f_{\mathbf {Q}}\left( \mathbf {w}+\frac{\theta \lambda }{T}\varvec{1}\right) \) and integrate over all \(\mathbf {w}\in \mathbb {R}^m\) with \(\theta \mathbf {w}\le \varvec{0}\). By (4.5) and (4.4), we get
Setting
and using the estimate \(|\theta (F_{\mathbf {X}}-F_{\mathbf {Y}})(\mathbf {z}_\varepsilon -\mathbf {w}) |\le S\) yields
by (4.5) and the fact that \(f_{\mathbf {Q}}\) is a probability density function.
Combining (4.7) and (4.8) yields
As the sum of random variables corresponds to a convolution, we have
Replacing \(\mathbf {z}\) and \(\mathbf {w}\) by \(\mathbf {z}_{\varepsilon }+\frac{\theta \lambda }{T}\varvec{1}\) and \(\mathbf {w}+\frac{\theta \lambda }{T}\varvec{1}\), respectively, and using (4.9) leads to
for all \(\varepsilon >0\). Taking the limit for \(\varepsilon \rightarrow 0\) and rearranging yields the right hand side of (4.6).
The left hand side of (4.6) is an immediate consequence of (4.10). \(\square \)
We are now able to bound the difference of the distribution functions by their characteristic functions.
Lemma 4.4
We have
Proof
Let \(\mathbf {a}\), \(\mathbf {z}\in \mathbb {R}^m\) with \(\mathbf {a}\le \mathbf {z}\).
The random variable \(\mathbf {X}_J+\mathbf {Q}_J\) admits a density function, because \(\mathbf {Q}_J\) admits a density function. In particular, \(\mathbf {X}_J+\mathbf {Q}_J\) is a continuous random variable. By Lévy’s theorem (see, e.g., [26, Thm. 1.8.4]),
As \(\varphi _{\mathbf {X}_J+\mathbf {Q}_J}(\mathbf {t}_J)=\varphi _{\mathbf {X}_J}(\mathbf {t}_J)\varphi _{\mathbf {Q}_J}(\mathbf {t}_J)\) and \(\varphi _{\mathbf {Q}_J}(\mathbf {t}_J)\) vanishes outside \([-T, T]^J\) by Lemma 4.2, we can replace the limit \(T_j\rightarrow \infty \) by setting \(T_j=T\), i.e.,
Taking the product over all \(J\in \alpha \) and summing over \(\alpha \in \Pi _L\) yields
where Fubini’s theorem and the fact that \(\varphi _\mathbf {Q}(\mathbf {t})=\prod _{J\in \alpha }\varphi _{\mathbf {Q}_J}(\mathbf {t}_J)\) have been used. By definition of \(\varphi _{\mathbf {X}}\), we have \(\varphi _{\mathbf {X}_J}(\mathbf {t}_J)=\varphi _{\mathbf {X}}(\psi _{J, L}(\mathbf {t}))\). Therefore, we can use the definition of \(\Lambda (\varphi _\mathbf {X})\) to rewrite (4.12) to
This equation remains valid when replacing \(\mathbf {X}\) by \(\mathbf {Y}\); taking the difference results in
If the integral on the right hand side of (4.11) is infinite, there is nothing to show. Thus we may assume that it is finite. This also implies that
is an integrable function on \(\mathbb {R}^m\) (as it vanishes outside \([-T, T]^m\)). Then by the Riemann–Lebesgue lemma, we may take the limit \(a_\ell \rightarrow -\infty \) for all \(\ell \in L\) in (4.13) to obtain
Taking absolute values and rewriting the left hand side in terms of marginal distribution functions yields (4.11). \(\square \)
We now bound the contribution of the lower dimensional distributions.
Lemma 4.5
We have
Proof
Let \(\alpha =\{J_1,\ldots , J_r\}\in \Pi _L\). Then
because the products over the distribution functions are bounded by 1.
Therefore,
A partition \(\alpha \in \Pi _L\) with \(J\in \alpha \) can be uniquely written as \(\alpha =\{J\}\cup \beta \) for a \(\beta \in \Pi _{L\setminus J}\). Thus
because there are \(\genfrac \{\}{0pt}{}{m-|J |}{k}\) partitions of \(L\setminus J\) with k parts. Using the left hand side of (4.6) yields the assertion (more precisely, applying a version of the left hand side of (4.6) for marginal distributions). \(\square \)
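The bijection \(\alpha =\{J\}\cup \beta \) used in this proof can be checked numerically: the partitions of L containing a fixed part J correspond exactly to the partitions of \(L\setminus J\), whose total number is a Bell number. A small sketch:

```python
def partitions(elems):
    """Yield all set partitions of the tuple elems as lists of tuples."""
    if not elems:
        yield []
        return
    first, rest = elems[0], elems[1:]
    for beta in partitions(rest):
        for i in range(len(beta)):
            yield beta[:i] + [(first,) + beta[i]] + beta[i + 1:]
        yield beta + [(first,)]

def bell(n):
    # Bell number: total number of partitions of an n-element set.
    return sum(1 for _ in partitions(tuple(range(n))))

m = 5
L = tuple(range(m))
J = frozenset({0, 2})  # an arbitrary fixed part

# Partitions of L that contain J as a part ...
count = sum(1 for alpha in partitions(L)
            if J in {frozenset(part) for part in alpha})
# ... correspond bijectively to partitions of L \ J.
assert count == bell(m - len(J))
```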
Now, we can complete the proof of the theorem.
Proof of Theorem 2
The estimate (2.1) follows from Lemma 4.3 [more precisely, the right hand side of (4.6)], Lemma 4.4 and Lemma 4.5.
If the expectation of \(\mathbf {X}\) exists, \(\varphi _{\mathbf {X}}\) is differentiable. Therefore, \(\Lambda (\varphi _\mathbf {X})\) is differentiable, too. By Lemma 3.1, \(\Lambda (\varphi _\mathbf {X})(\mathbf {t})\) has a zero whenever one of the \(t_\ell \), \(\ell \in L\), vanishes. Thus
is bounded around \(\varvec{0}\) and therefore bounded on \([-T, T]^m\). The same holds for \(\mathbf {Y}\). Thus the integral on the right hand side of (2.1) converges. \(\square \)
5 Proof of the quasi-power theorem
We may now prove the m-dimensional quasi-power theorem, Theorem 1.
Let \(\varvec{\mu }_n=\phi _{n}{{\mathrm{grad}}}u(\varvec{0}) \) and \(\Sigma =H_u(\varvec{0})\). We define the random vector \(\mathbf {X}=\phi _{n}^{-1/2}(\varvec{\Omega }_n-\varvec{\mu }_n)\). For simplicity, we ignore the dependence on n in this and the following notations.
First, we establish bounds for the characteristic function of \(\mathbf {X}\).
Lemma 5.1
For \(\Sigma \) regular or singular, there exists a function \(V(\mathbf {s})\) which is analytic for \(||\mathbf {s} ||< \tau \sqrt{\phi _{n}}/2\) such that
and
hold for all \(\mathbf {s}\in \mathbb {C}^K\) with \(||\mathbf {s} ||< \tau \sqrt{\phi _{n}}/2\).
For \(n\rightarrow \infty \), \(\mathbf {X}\) converges in distribution to a normal distribution with mean \(\varvec{0}\) and variance-covariance matrix \(\Sigma \). In particular, \(\Sigma \) is positive (semi-)definite if it is regular (singular, respectively).
Proof
By replacing \(u(\mathbf {s})\) and \(v(\mathbf {s})\) by \(u(\mathbf {s})-u(\varvec{0})\) and \(v(\mathbf {s})-v(\varvec{0})\), respectively, we may assume that \(u(\varvec{0})=v(\varvec{0})=0\). We define \(E_n(\mathbf {s})\) by the relation \(M_n(\mathbf {s})=e^{W_n(\mathbf {s})}(1+E_n(\mathbf {s}))\) and note that by assumption, \(E_n(\mathbf {s})=O(\kappa _n^{-1})\) uniformly for \(\Vert \mathbf {s}\Vert \le \tau \). We note that this implies \(E_n(\varvec{0})=0\).
By assumption, \(M_n(\mathbf {s})\) exists for \(||\mathbf {s} ||\le \tau \). Therefore, it is continuous for these \(\mathbf {s}\) and, by Morera’s theorem combined with applications of Fubini’s and Cauchy’s theorems, \(M_n(\mathbf {s})\) is analytic for \(||\mathbf {s} ||\le \tau \). This also implies that \(E_n(\mathbf {s})\) is analytic for \(||\mathbf {s} ||\le \tau \). By Cauchy’s formula, we have
for \(||\mathbf {s} ||<\tau /2\). Thus
for \(\Vert \mathbf {s}\Vert <\tau /2\) (where \([\varvec{0}, \mathbf {s}]\) denotes the line segment from \(\varvec{0}\) to \(\mathbf {s}\)).
We calculate that
with
Since \(u(\varvec{0})=v(\varvec{0})=0\) and the first and second order terms of u cancel out, we have
for \(\Vert \mathbf {s}\Vert <\tau \sqrt{\phi _{n}}/2\).
Note that
for \(\mathbf {s}\in \mathbb {C}^m\), which implies that, in distribution, \(\mathbf {X}\) converges to the normal distribution with mean zero and variance-covariance matrix \(\Sigma \). Although we have to refine our estimates for applying Theorem 2, we immediately conclude that \(\Sigma \) is positive (semi-) definite depending on whether it is regular or not. \(\square \)
Let now \(\Sigma \) be regular. By \(\mathbf {Y}\) we denote a normally distributed random variable in \(\mathbb {R}^m\) with mean \(\varvec{0}\) and variance-covariance matrix \(\Sigma \). Its characteristic function is
\[\varphi _{\mathbf {Y}}(\mathbf {t})=e^{-\frac {1}{2}\mathbf {t}^{\top }\Sigma \mathbf {t}}.\]
The smallest eigenvalue of \(\Sigma \) is denoted by \(\sigma >0\).
We are now able to bound the functions occurring in the Berry–Esseen inequality.
Lemma 5.2
There exists a \(c<\tau /2\) such that
holds for all \(\mathbf {s}\in \mathbb {C}^L\) with \(||\mathbf {s} ||\le c\sqrt{\phi _{n}}\) and \(||\mathfrak {I}\mathbf {s} ||\le 1\).
Proof
Let \(\alpha \in \Pi _L\). Then by Lemma 5.1, we have
For \(\mathbf {t}\in \mathbb {R}^L\), we have \(\mathbf {t}^\top \Sigma \mathbf {t}\ge \sigma \mathbf {t}^\top \mathbf {t}\ge \sigma ||\mathbf {t} ||^2\). For complex w, we have \(|\exp (w)-1|\le |w|\exp (|w|)\). Splitting \(\mathbf {s}\) into its real and imaginary parts in the first summand and using these inequalities for the first and second factor of (5.2), respectively, yields
by (5.1). For sufficiently small c, we obtain
Multiplication by \(|\mu _\alpha |\) and summation over all \(\alpha \in \Pi _L\) conclude the proof of the lemma. \(\square \)
The last ingredient to prove the quasi-power theorem is a bound of the integrals occurring in the Berry–Esseen inequality.
Lemma 5.3
Let c be as in Lemma 5.2. Then
Proof
For simplicity, set \(h=\Lambda (\varphi _{\mathbf {X}})-\Lambda (\varphi _\mathbf {Y})\). For a partition \(\{J,K\}\) of L, set
and partition \(\mathbf {s}\) into \((\mathbf {s}_{J},\mathbf {s}_{K})\). We use the notation
when \(J=\{j_1,\ldots , j_{|J |}\}\). The product of the paths from 0 to \(s_j\) for \(j\in J\) is denoted by \([\varvec{0}, \mathbf {s}_{J}]\).
By Lemma 3.1, we have
By Cauchy’s integral formula, we have
where \(\zeta _j\) is integrated over the circle of radius 1 around \(z_j\) for \(j\in J\), thus \(||\mathfrak {I}\varvec{\zeta }_{J} ||\le 1\).
Using the estimate of Lemma 5.2 yields
Combining (5.3), (5.4) and (5.5) leads to
The inner integral results in \(|\prod _{j\in J}s_j |\). The factors \(|s_k |\ge 1\) for \(k\in K\) in the denominator can simply be omitted. If \(K\ne \emptyset \), we still have to bound
where the integration bounds are meant coordinate-wise. Then we use the fact that
is finite for all constants \(t\ge 0\). Thus, after completing the square in the argument of the exponential function, the integral over \(s_k\) is bounded by a constant, i.e.,
We conclude that
Summation over all partitions \(\{J,K\}\) of L completes the proof of the lemma. \(\square \)
We now collect all results to prove Theorem 1.
Proof of Theorem 1
We set \(T=c\sqrt{\phi _{n}}\) with c from Lemma 5.2. By Theorem 2 and Lemma 5.3, we have
For \(\emptyset \ne J\subsetneq L\), we have \(\varphi _{\mathbf {X}_J}=\varphi _{\mathbf {X}}\circ \chi _{J, L}\). Therefore, all prerequisites for applying the quasi-power theorem to \((\varvec{\Omega }_n)_J\) are fulfilled, so we can apply (5.6) recursively and finally obtain
\(\square \)
Note that it would also have been possible to apply Corollary 2.2; however, this would have required proving Lemmas 5.2 and 5.3 for subsets K of L, which would have required some notational overhead using \(\chi _{K, L}\).
Proof of Proposition 1.1
This follows by the same arguments as in [17, Thm. 2]. \(\square \)
References
Bender, E.A., Richmond, L.B.: Central and local limit theorems applied to asymptotic enumeration II: multivariate generating functions. J. Combin. Theory Ser. A 34, 255–265 (1983)
Berry, A.C.: The accuracy of the Gaussian approximation to the sum of independent variates. Trans. Am. Math. Soc. 49, 122–136 (1941)
Drmota, M.: Random Trees. Springer, New York (2009)
Eagle, C., Gao, Z., Omar, M., Panario, D., Richmond, B.: Distribution of the number of encryptions in revocation schemes for stateless receivers. In: Fifth Colloquium on Mathematics and Computer Science, Discrete Mathematics and Theoretical Computer Science Proceedings, AI, pp. 195–206 (2008)
Esseen, C.-G.: Fourier analysis of distribution functions. A mathematical study of the Laplace-Gaussian law. Acta Math. 77, 1–125 (1945)
Flajolet, P., Sedgewick, R.: Analytic Combinatorics. Cambridge University Press, Cambridge (2009)
Gamkrelidze, N.G.: A multidimensional generalization of Esseen’s inequality for distribution functions. Teor. Verojatnost. i Primenen. 22(4), 897–900 (1977)
Gamkrelidze, N.G.: A multidimensional generalization of Esseen’s inequality for distribution functions. Theory Probab. Appl. 22, 877–880 (1977). English Translation of the paper in Teor. Verojatnost. i Primenen
Gnedenko, B.V., Kolmogorov, A.N.: Limit Distributions for Sums of Independent Random Variables. Addison-Wesley Publishing Company Inc, Cambridge (1954). Translated and annotated by K. L. Chung. With an Appendix by J. L. Doob
Graham, R.L., Knuth, D.E., Patashnik, O.: Concrete Mathematics. A Foundation for Computer Science, 2nd edn. Addison-Wesley, Cambridge (1994)
Gut, A.: Probability: A Graduate Course. Springer Texts in Statistics. Springer, New York (2005)
Heuberger, C.: Hwang’s quasi-power-theorem in dimension two. Quaest. Math. 30, 507–512 (2007)
Heuberger, C., Kropf, S.: On the higher dimensional quasi-power theorem and a Berry–Esseen inequality. In: Proceedings of the 27th International Conference on Probabilistic, Combinatorial and Asymptotic Methods for the Analysis of Algorithms (2016). arXiv:1602.04055 [math.PR]
Heuberger, C., Kropf, S., Wagner, S.: Variances and covariances in the central limit theorem for the output of a transducer. Eur. J. Combin. 49, 167–187 (2015)
Heuberger, C., Prodinger, H.: Analysis of alternative digit sets for nonadjacent representations. Monatsh. Math. 147, 219–248 (2006)
Heuberger, C., Prodinger, H.: The Hamming weight of the non-adjacent-form under various input statistics. Period. Math. Hungar. 55, 81–96 (2007)
Hwang, H.-K.: On convergence rates in the central limit theorems for combinatorial structures. Eur. J. Combin. 19, 329–343 (1998)
Kropf, S.: Variance and covariance of several simultaneous outputs of a Markov chain. Discrete Math. Theor. Comput. Sci. 18(3) (2016)
The On-Line Encyclopedia of Integer Sequences. http://oeis.org (2015)
Petrov, V.V.: Classical-type limit theorems for sums of independent random variables. In: Prokhorov, Y.V., Statulevicius, V. (eds.) Limit Theorems of Probability Theory. Springer, Berlin (2000)
Prakasa Rao, B.L.S.: Another Esseen-type inequality for multivariate probability density functions. Stat. Probab. Lett. 60(2), 191–199 (2002)
Roussas, G.G.: An Esseen-type inequality for probability density functions, with an application. Stat. Probab. Lett. 51(4), 397–408 (2001)
Sadikova, S.M.: On two-dimensional analogs of an inequality of Esseen and their application to the central limit theorem. Teor. Verojatnost. i Primenen. 11, 369–380 (1966)
Sadikova, S.M.: On two-dimensional analogues of an inequality of Esseen and their application to the central limit theorem. Theory Probab. Appl. XI, 325–335 (1966). English Translation of the paper in Teor. Verojatnost. i Primenen
Stanley, R.P.: Enumerative Combinatorics. Volume 1. Cambridge Studies in Advanced Mathematics, vol. 49, 2nd edn. Cambridge University Press, Cambridge (2012)
Ushakov, N.G.: Selected Topics in Characteristic Functions. Modern Probability and Statistics, vol. 4. VSP, Utrecht (1999)
Acknowledgements
Open access funding provided by University of Klagenfurt.
Communicated by A. Constantin.
The authors are supported by the Austrian Science Fund (FWF): P 24644-N26.
This is the full version of the extended abstract [13].
Heuberger, C., Kropf, S. Higher dimensional quasi-power theorem and Berry–Esseen inequality. Monatsh Math 187, 293–314 (2018). https://doi.org/10.1007/s00605-018-1215-6