The topic of this paper is the explicit approximation, in various metrics, of random variables which, in terms of characteristic functions, behave like a sum

$$\begin{aligned} X_n=Z_n+Y_n \end{aligned}$$

of a “model” variable \(Z_n\) (for instance, a Poisson random variable) and an independent perturbation \(Y_n\), when the model variable has “large” parameter. Our interest is in discrete random variables, and in cases where this simple-minded decomposition does not in fact exist. We have two motivations:

(1) In probabilistic number theory, it has been known since the proof by Rényi and Turán of the Erdős–Kac theorem that the random variable \(\omega (N_n)\) given by the number of prime divisors (without multiplicity, for definiteness) of an integer \(N_n\) uniformly chosen in the interval \(\{1,2,\ldots ,n\}\) has characteristic function given by

$$\begin{aligned} \mathbb{E }\{e^{i\theta \omega (N_n)}\} \ =\ \mathbb{E }\{e^{i\theta Z_n}\}\Phi (\theta )(1+o(1)) \end{aligned}$$

as \(n\rightarrow \infty \), where \(Z_n \sim \mathrm{Po\,}(\log \log n)\) is a Poisson variable with mean \(\log \log n\) and \(\Phi (\theta )\) is defined by

$$\begin{aligned} \Phi (\theta )=\frac{1}{\Gamma (e^{i\theta })} \prod _{p \text{ prime }}{ \left( 1+\frac{e^{i\theta }-1}{p}\right) \left( 1-\frac{1}{p}\right) ^{e^{i\theta }-1}, } \end{aligned}$$

the product being absolutely convergent for all \(\theta \) real. This \(\Phi (\theta )\) is not the characteristic function of a probability distribution, and hence formula (1.1) with \(Z_n \sim \mathrm{Po\,}(\log \log n)\) cannot be true. However, we are nonetheless able to obtain explicit approximation statements for the law of \(\omega (N_n)\):

Theorem 1.1

For every integer \(r\ge 0\), there exist explicitly computable signed measures \(\nu _{r,n}\) on the positive integers such that the total variation distance between the law of \(\omega (N_n)\) and \(\nu _{r,n}\) is of order \(O\{(\log \log n)^{-(r+1)/2}\}\) for \(n\ge 3\).

This is proved in Sect. 7.3, where formulas for the measures \(\nu _{1,n}\) and \(\nu _{2,n}\) are also given. Such results are new in analytic number theory, where total variation distance estimates have hardly been considered before [but see [4] for a result concerning the total variation distance to a Poisson approximation for the distribution of a truncated version of \(\omega (N_n)\)].

For more on the significance of the Rényi–Turán formula, comparison with the Keating–Snaith conjectures for the Riemann zeta function, and finite-field analogues, see Kowalski and Nikeghbali [6].

(2) In a beautiful paper, Hwang [5] considered sequences of non-negative integer valued random variables \(X_n\), whose probability generating functions \(f_{X_n}\) satisfy

$$\begin{aligned} e^{\lambda _n(1-z)}f_{X_n}(z) \ \rightarrow \ g(z), \end{aligned}$$

for all \(z\in \mathbb{C }\) with \(|z| \le \eta \), for some \(\eta > 1\), where the function \(g\) is analytic, and \(\lim _{n\rightarrow \infty }\lambda _n = \infty \). This assumption is also intuitively related to a model (1.1). Under some extra conditions, Hwang exhibits bounds of order \(O(\lambda _n^{-1})\) on the accuracy of the approximation of the distribution of \(X_n\) by a Poisson distribution with carefully chosen mean, close to \(\lambda _n\). Hwang [5] also notes that his methods can be applied to families of distributions other than the Poisson family, and gives examples using the Bessel family.

In this paper, we systematically consider sequences of integer valued random variables \(X_n\), whose characteristic functions \(\phi _{X_n}\) satisfy a condition which, in the Poisson context, is some strengthening of the convergence

$$\begin{aligned} \exp \{\lambda _n(1-e^{i\theta })\}\phi _{X_n}(\theta ) \rightarrow \psi (\theta ), \quad 0 < |\theta | \le \pi . \end{aligned}$$

Under suitable conditions, we derive explicit approximations to the distribution of \(X_n\), in various metrics, by measures related to the Poisson model. The approximations can be made close to any given polynomial order in \(\lambda _n^{-1/2}\), if the conditions are sharp enough and the measure is correspondingly chosen. The conditions that we require for these expansions are much weaker than those of Hwang [5]. For instance, his conditions require the \(X_n\) to take only non-negative values, and to have exponential tails, neither of which conditions we need to impose.

Our basic result, Proposition 2.1, is very simple and explicit. It enables us to dispense with asymptotic settings, and to prove concrete error bounds. It also allows us to consider approximation by quite general families of distributions on the integers, instead of just the Poisson family, requiring only the replacement of the Poisson characteristic function in (1.2) by the characteristic function corresponding to the family chosen. This enables us to deduce expansions based on any such discrete family of distributions, as shown in Sect. 4, without any extra effort. Indeed, the main problem would seem to be to identify the higher order terms in the expansions, but these turn out simply to be linear combinations of the higher order differences of the basic distribution: see (2.6).

This elementary result, and a simple but powerful theorem that follows from it, are given, together with an example, in Sect. 2. The conditions are then substantially relaxed, in order to allow for wider application, and to treat total variation approximation in a satisfactory manner. The general conclusions are proved in the context of approximating finite signed measures in Sect. 3, and they are reformulated for approximating probability distributions in the usual asymptotic framework in Sect. 4.

In the Poisson context, the measures that result are the Poisson–Charlier measures. Our general results enable us to deduce a Poisson–Charlier approximation with error of order \(O(\lambda _n^{-t/2})\), for any prescribed \(t\), assuming that Hwang’s conditions hold. We also show that the Poisson–Charlier expansions are valid under more general conditions, in which the \(X_n\) may have only a few finite moments. These expansions are established in Sect. 5, and the compound Poisson context is briefly discussed in Sect. 6. In Sect. 7, we discuss some applications: to sums of independent integer valued random variables, to Hwang’s setting, and to our first motivation, where we prove Theorem 1.1.

In order to ease the reading of this paper, we give here a diagram indicating the logical dependency of the results we prove. On the left-hand side are the basic approximation theorems, the right-hand side represents applications, and the results of Sect. 4 represent the bridge linking the two:

[Figure a: diagram of the logical dependency of the results]

We frame our approximations in terms of three distances between (signed) measures \(\mu \) and \(\nu \) on the integers: the point metric

$$\begin{aligned} d_{\mathrm{loc}}(\mu ,\nu ) \ :=\ \sup _{j\in \mathbb{Z }}|\mu \{j\} - \nu \{j\}|, \end{aligned}$$

the Kolmogorov distance

$$\begin{aligned} d_{\mathrm{K}}(\mu ,\nu ) \ :=\ \sup _{j\in \mathbb{Z }}|\mu \{(-\infty ,j]\} - \nu \{(-\infty ,j]\}|, \end{aligned}$$

and the total variation norm

$$\begin{aligned} \Vert \mu -\nu \Vert \ :=\ \sum _{j\in \mathbb{Z }}|\mu \{j\} - \nu \{j\}|. \end{aligned}$$

Other metrics could also be treated using our methods.
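For concreteness, the three distances can be evaluated directly for finitely supported signed measures. The following is a minimal sketch, assuming that a measure is stored as a dictionary from integers to masses; the function names `d_loc`, `d_K` and `tv_norm` are our own illustrative choices.

```python
def d_loc(mu, nu):
    """Point metric: sup_j |mu{j} - nu{j}| for measures stored as {integer: mass}."""
    support = set(mu) | set(nu)
    return max(abs(mu.get(j, 0.0) - nu.get(j, 0.0)) for j in support)


def d_K(mu, nu):
    """Kolmogorov distance: sup_j |mu{(-inf, j]} - nu{(-inf, j]}|."""
    worst, running = 0.0, 0.0
    for j in sorted(set(mu) | set(nu)):
        # running difference of the two distribution functions at j
        running += mu.get(j, 0.0) - nu.get(j, 0.0)
        worst = max(worst, abs(running))
    return worst


def tv_norm(mu, nu):
    """Total variation norm: sum_j |mu{j} - nu{j}| (no factor 1/2, as in the text)."""
    support = set(mu) | set(nu)
    return sum(abs(mu.get(j, 0.0) - nu.get(j, 0.0)) for j in support)


# Illustrative measures (hypothetical data, chosen only for the example).
mu = {0: 0.5, 1: 0.5}
nu = {0: 0.25, 1: 0.5, 2: 0.25}
print(d_loc(mu, nu), d_K(mu, nu), tv_norm(mu, nu))
```

Note that the total variation norm here is the full \(\ell _1\) distance, so that it can be as large as 2 for two probability measures.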

The basic estimate

The essence of our argument is the following elementary result, linking the closeness of finite signed measures \(\mu \) and \(\nu \) to the closeness of their characteristic functions, when these have a common factor involving a ‘large’ parameter \(\rho \); for a finite signed measure \(\zeta \) on \(\mathbb{Z }\), the characteristic function \(\phi _\zeta \) is defined by \(\phi _\zeta (\theta ) := \sum _{j\in \mathbb{Z }} e^{ij\theta }\zeta \{j\}\), for \(|\theta |\le \pi \).

Proposition 2.1

Let \(\mu \) and \(\nu \) be finite signed measures on \(\mathbb{Z }\), with characteristic functions \(\phi _\mu \) and \(\phi _\nu \) respectively. Suppose that \(\phi _\mu = \psi _\mu \chi \) and \(\phi _\nu = \psi _\nu \chi \), and write \(d_{\mu \nu }:= \psi _\mu - \psi _\nu \). Suppose that, for some \( \gamma ,\rho ,t>0\),

$$\begin{aligned} |d_{\mu \nu }(\theta )| \ \le \ \gamma |\theta |^t \quad \text{ and }\quad |\chi (\theta )| \le e^{-\rho \theta ^2} \quad \text{ for } \text{ all }\quad |\theta |\le \pi . \end{aligned}$$

Then there are explicit constants \(\alpha _{1t}\) and \(\alpha _{2t}\) such that

  1. \(\sup _{j\in \mathbb{Z }}|\mu \{j\} - \nu \{j\}| \ \le \ \alpha _{1t} \gamma (\rho \vee 1)^{-(t+1)/2};\)

  2. \(\sup _{a \le b\in \mathbb{Z }}|\mu \{[a,b]\} - \nu \{[a,b]\}| \ \le \ \alpha _{2t} \gamma (\rho \vee 1)^{-t/2}\).


For any \(j\in \mathbb{Z }\), the Fourier inversion formula gives

$$\begin{aligned} \mu \{j\} - \nu \{j\} \ =\ \frac{1}{2\pi }\int _{-\pi }^\pi e^{-ij\theta }(\psi _\mu (\theta )-\psi _\nu (\theta ))\chi (\theta )\,d\theta , \end{aligned}$$

from which our assumptions imply directly that

$$\begin{aligned} |\mu \{j\} - \nu \{j\}| \le \frac{1}{2\pi }\int _{-\pi }^\pi \gamma |\theta |^t\exp \{-\rho \theta ^2\}\,d\theta . \end{aligned}$$

For \(\rho \le 1\), we thus have

$$\begin{aligned} |\mu \{j\} - \nu \{j\}| \le \frac{1}{2\pi }\int _{-\pi }^\pi \gamma |\theta |^t\,d\theta \ \le \ \frac{\pi ^{t} \gamma }{t+1} \ =:\ \beta _{1t} \gamma . \end{aligned}$$

For \(\rho \ge 1\), it is immediate that

$$\begin{aligned} |\mu \{j\} - \nu \{j\}| \le \frac{ \gamma }{2\pi }\Bigl (\frac{1}{\sqrt{2\rho }}\Bigr )^{t+1} \int _{-\infty }^\infty |y|^t e^{-y^2/2}\,dy \ \le \ \beta ^{\prime }_{1t} \gamma \rho ^{-(t+1)/2}, \end{aligned}$$

with \(\beta ^{\prime }_{1t} := 2^{-(t+1)/2}m_{t}/\sqrt{2\pi }\); here, \(m_t\) denotes the \(t\)-th absolute moment of the standard normal distribution. Setting

$$\begin{aligned} \alpha _{1t} \ :=\ \max \{\beta _{1t},\beta ^{\prime }_{1t}\} \ =\ \max \left\{ 2^{-(t+1)/2}m_{t}/\sqrt{2\pi },\,\pi ^{t}/(t+1)\right\} , \end{aligned}$$

this proves part 1. The second part is similar, adding (2.2) over \(a\le j\le b\), and estimating

$$\begin{aligned} \frac{\left| e^{-ia\theta } - e^{-i(b+1)\theta }\right| }{|1 - e^{-i\theta }|} \ \le \ \frac{\pi }{|\theta |}, \quad |\theta | \le \pi . \end{aligned}$$

This gives part 2, with

$$\begin{aligned} \alpha _{2t} \ :=\ \max \{2^{-t/2}m_{t-1}\sqrt{\pi /2},\, \pi ^{t}/t\}. \end{aligned}$$

\(\square \)
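The constants \(\alpha _{1t}\) and \(\alpha _{2t}\) are easily evaluated. The following sketch assumes the standard formula \(m_t = 2^{t/2}\Gamma ((t+1)/2)/\sqrt{\pi }\) for the absolute moments of the standard normal distribution; the function names are hypothetical and serve only as illustration.

```python
import math


def abs_normal_moment(t):
    """t-th absolute moment of N(0,1): 2^{t/2} Gamma((t+1)/2) / sqrt(pi), for t > -1."""
    return 2.0 ** (t / 2.0) * math.gamma((t + 1.0) / 2.0) / math.sqrt(math.pi)


def alpha1(t):
    """alpha_{1t} = max{beta_{1t}, beta'_{1t}} from the proof of part 1 (t > 0)."""
    beta = math.pi ** t / (t + 1.0)
    beta_prime = 2.0 ** (-(t + 1.0) / 2.0) * abs_normal_moment(t) / math.sqrt(2.0 * math.pi)
    return max(beta, beta_prime)


def alpha2(t):
    """alpha_{2t} as given at the end of the proof of part 2 (t > 0)."""
    normal_part = 2.0 ** (-t / 2.0) * abs_normal_moment(t - 1.0) * math.sqrt(math.pi / 2.0)
    return max(normal_part, math.pi ** t / t)


def local_bound(gamma, rho, t):
    """Right-hand side of part 1 of Proposition 2.1."""
    return alpha1(t) * gamma * max(rho, 1.0) ** (-(t + 1.0) / 2.0)


def interval_bound(gamma, rho, t):
    """Right-hand side of part 2 of Proposition 2.1."""
    return alpha2(t) * gamma * max(rho, 1.0) ** (-t / 2.0)


print(alpha1(1.0), alpha2(1.0), local_bound(1.0, 4.0, 1.0))
```

For \(t=1\), the maximum is attained in both cases by the term arising from \(\rho \le 1\), giving \(\alpha _{11} = \pi /2\) and \(\alpha _{21} = \pi \).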

We shall principally be concerned with taking \(\mu \) to be the distribution of a random variable \(X\). We allow \(\nu \) to be a signed measure, because in many cases, such as in the following canonical example and in the Poisson–Charlier expansions of Sect. 5, signed measures appear as the natural approximations.

Let \(X\) be an integer valued random variable with characteristic function \(\phi _X := \psi \chi \), where \(\chi \) is the characteristic function of a (well-known) probability distribution \(R\) on \(\mathbb{Z }\). Suppose that \(\chi \) satisfies

$$\begin{aligned} |\chi (\theta )| \le e^{-\rho \theta ^2}, \end{aligned}$$

as for Proposition 2.1, and that \(\psi \) can be approximated by a polynomial expansion around \(\theta =0\) of the form

$$\begin{aligned} {\tilde{\psi }}_r(\theta ) \ :=\ \sum _{l=0}^r\tilde{a}_l(e^{i\theta }-1)^l, \end{aligned}$$

for real coefficients \(\tilde{a}_l\) (and with \(\tilde{a}_0=1\)) and some \(r\in \mathbb{N }_0\), in that

$$\begin{aligned} |\psi (\theta ) - {\tilde{\psi }}_r(\theta )| \ \le \ K_{r\delta } |\theta |^{r+\delta },\quad |\theta | \le \pi , \end{aligned}$$

for some \(0<\delta \le 1\). In view of Proposition 2.1, this suggests that the distribution of \(X\) may be well approximated by the signed measure \(\nu _r = \nu _r(R;\tilde{a}_1,\ldots ,\tilde{a}_r)\) having \({\tilde{\psi }}_r\chi \) as characteristic function. Now \(\nu _r\) can immediately be identified as

$$\begin{aligned} \nu _r \ =\ \sum _{l=0}^r(-1)^l\tilde{a}_l D^l R, \end{aligned}$$

where the differences \(D^l R\) of the probability measure \(R\) are determined by iterating the relation \(DR\{j\} := R\{j\} - R\{j-1\}\). Hence, under these assumptions, Proposition 2.1 implies the following theorem; note that the assumption (2.5) is much like supposing that \(\psi \) has a Taylor expansion of length \(r\) around zero (in powers of \(i\theta \)), and hence that \(X\) has a corresponding number of finite moments.

Theorem 2.2

Let \(X\) be a random variable on \(\mathbb{Z }\) with distribution \(P_X\). Suppose that its characteristic function \(\phi _X\) is of the form \(\psi \chi \), where \(\chi \) is the characteristic function of a probability distribution \(R\) and satisfies (2.3) above. Suppose also that (2.5) is satisfied, for some \(r\in \mathbb{N }_0, \,\tilde{a}_1,\ldots ,\tilde{a}_r \in \mathbb{R }\) and \(\delta \ge 0\). Then, writing \(t=r+\delta \), we have

  1. \(d_{\mathrm{loc}}(P_X,\nu _r) \ \le \ \alpha _{1t} K_{r\delta }(\rho \vee 1)^{-(t+1)/2}\);

  2. \(d_K(P_X,\nu _r) \ \le \ \alpha _{2t} K_{r\delta } (\rho \vee 1)^{-t/2}\),

with \(\alpha _{1t}\) and \(\alpha _{2t}\) as in Proposition 2.1, and with \(\nu _r \ =\ \nu _r(R;\tilde{a}_1,\ldots ,\tilde{a}_r)\) as defined in (2.6).


Note that Proposition 2.1 can be applied with \(\psi _\mu = 0\), corresponding to \(\mu \) the zero measure, and \(\psi _\nu (\theta ) = \tilde{a}_l(e^{i\theta }-1)^l\), for any \(1\le l\le r\), showing that the contribution from the \(l\)-th term in the expansion to \(\nu _r\{j\}\) is at most \(|\tilde{a}_l|\alpha _{1l}(\rho \vee 1)^{-(l+1)/2}\), and that to \(\nu _r\{[a,b]\}\) at most \(|\tilde{a}_l|\alpha _{2l}(\rho \vee 1)^{-l/2}\). Thus, if \(\rho \) is large and the coefficients \(\tilde{a}_l\) moderate, the contributions decrease in powers of \(\rho ^{-1/2}\) as \(l\) increases. In such circumstances, the signed measure \(\nu _r\) can be seen as a perturbation of the underlying distribution \(R\).
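The signed measure \(\nu _r\) of (2.6) is simple to compute by repeated differencing. The sketch below assumes that \(R\) is (numerically) supported on \(\{0,1,\ldots ,n\}\) and stored as a list of masses, and takes a truncated Poisson distribution as an illustration; the helper names and the coefficients used are hypothetical.

```python
import math


def poisson_pmf(lam, n_max):
    """Po(lam) masses on 0..n_max, via the recurrence p_j = p_{j-1} * lam / j."""
    pmf = [math.exp(-lam)]
    for j in range(1, n_max + 1):
        pmf.append(pmf[-1] * lam / j)
    return pmf


def difference(masses):
    """(D R){j} = R{j} - R{j-1}; input lives on 0..n-1, output on 0..n."""
    padded = [0.0] + list(masses) + [0.0]
    return [padded[j + 1] - padded[j] for j in range(len(masses) + 1)]


def nu_r(base, coeffs):
    """nu_r = sum_{l=0}^r (-1)^l a_l D^l R, with a_0 = 1 and coeffs = [a_1, ..., a_r]."""
    result = [0.0] * (len(base) + len(coeffs))
    current = list(base)  # holds D^l R at step l
    for l, a in enumerate([1.0] + list(coeffs)):
        for j, mass in enumerate(current):
            result[j] += (-1) ** l * a * mass
        current = difference(current)
    return result


# Illustrative choice: R = Po(10) truncated at 80, with assumed coefficients.
base = poisson_pmf(10.0, 80)
nu = nu_r(base, [0.3, -0.1])
print(sum(nu))  # differences carry no mass, so the total stays close to 1
```

Since each difference \(D^lR\), \(l\ge 1\), has total mass zero, \(\nu _r(\mathbb{Z }) = R(\mathbb{Z })\), although \(\nu _r\) may take negative values.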

The simplest application of the above results arises when \(\phi _X = \phi _Yp_{\lambda }\), where \(p_{\lambda }(\theta ) = e^{\lambda (e^{i\theta }-1)}\) is the characteristic function of the Poisson distribution \(\mathrm{Po\,}(\lambda )\) with mean \(\lambda \), which satisfies (2.3) with \(\rho = 2\pi ^{-2}\lambda \), and \(\phi _Y\) is the characteristic function associated with a random variable \(Y\) on the integers. In this case, \(X=Z+Y\) is the sum of two independent random variables, as in (1.1), with \(Z\sim \mathrm{Po\,}(\lambda )\), and the situation is probabilistically very clear. For \(w = w_\theta = e^{i\theta }-1\), we have \(\phi _Y(\theta ) = \mathbb{E }\{(1+w)^Y\}\). The latter expression has an expansion in powers of \(w\) up to the term in \(w^r\) if the \(r\)-th moment of \(Y\) exists, with coefficients \(\tilde{a}_k := F_k(Y)/k!, \,1\le k\le r\), where \(F_k(Y)\) denotes the \(k\)-th factorial moment of \(Y\):

$$\begin{aligned} F_k(Y) \ :=\ \sum _{l\ge k} \frac{l!}{(l-k)!}\,\mathbb{P }[Y=l] + \sum _{l\ge 1} (-1)^k \frac{(l+k-1)!}{(l-1)!}\,\mathbb{P }[Y=-l]. \end{aligned}$$

Thus the asymptotic expansion of \(X\) around \(\mathrm{Po\,}(\lambda )\) is simply derived from the factorial moments of the perturbing random variable \(Y\), if they exist.

For example, we could take \(\phi _Y\) to be the characteristic function of a random variable \(Y_s\) with distribution

$$\begin{aligned} \mathbb{P }[Y_s=-l] \ =\ s!\, \frac{s}{l(l+1)\dots (l+s)},\quad l\ge 1, \end{aligned}$$

for some integer \(s\ge 1\); the random variable has only \(s-1\) moments, and takes negative values, so that the theorems in [5] cannot be applied. However, \(Y_s\) has factorial moments

$$\begin{aligned} F_k(Y_s) \ =\ (-1)^k s!\,\sum _{l\ge 1}\frac{s}{(l+k)\ldots (l+s)} \ =\ (-1)^k k! \,\frac{s}{s-k},\quad 1\le k\le s-1, \end{aligned}$$

and characteristic function

$$\begin{aligned} \psi _{Y_s}(\theta ) \ =\ 1 + \sum _{k=1}^{s-1}(-1)^k\,\frac{s}{s-k}\,(e^{i\theta }-1)^k - s(1-e^{i\theta })^s \log (1-e^{-i\theta }), \end{aligned}$$

and (2.5) holds for \({\tilde{\psi }}_r\) as in (2.4), with \(r=s-1\) and any \(\delta < 1\), for \(\tilde{a}_k = F_k(Y_s)/k! = (-1)^k s/(s-k)\). Hence, if \(X = Z + Y_s\), where \(Z\sim \mathrm{Po\,}(\lambda )\) is independent of \(Y_s\), then Theorem 2.2 can be applied, approximating the distribution of \(X\) by the signed measure \(\nu _{s-1}(\mathrm{Po\,}(\lambda );\tilde{a}_1,\ldots ,\tilde{a}_{s-1})\).
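The closed form for the factorial moments of \(Y_s\) can be checked numerically. The following sketch truncates the defining series at a finite level, so that agreement is only approximate; the truncation level `l_max` and the function names are assumptions made for the illustration.

```python
import math


def pmf_Ys(s, l):
    """P[Y_s = -l] = s! * s / (l (l+1) ... (l+s)), for l >= 1."""
    denom = 1
    for i in range(s + 1):
        denom *= l + i
    return math.factorial(s) * s / denom


def factorial_moment_Ys(s, k, l_max=50000):
    """Truncated series for F_k(Y_s); only the negative-value sum contributes."""
    total = 0.0
    for l in range(1, l_max + 1):
        rising = 1
        for i in range(k):
            rising *= l + i  # (l+k-1)!/(l-1)! = l (l+1) ... (l+k-1)
        total += (-1) ** k * rising * pmf_Ys(s, l)
    return total


def closed_form(s, k):
    """(-1)^k k! s / (s-k), valid for 1 <= k <= s-1."""
    return (-1) ** k * math.factorial(k) * s / (s - k)


s = 3
for k in range(1, s):
    print(k, factorial_moment_Ys(s, k), closed_form(s, k))
```

For \(k=s-1\) the summands decay only like \(l^{-2}\), so the truncated sum converges slowly; this reflects the fact that the \(s\)-th moment of \(Y_s\) is infinite.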


Weaker conditions

Proposition 2.1 yields explicit bounds on \(d_{\mathrm{loc}}(\mu ,\nu )\) and \(d_{\mathrm{K}}(\mu ,\nu )\) in terms of the quantities specified in (2.1). However, for many applications, a slight weakening of its conditions is useful, in which Conditions (2.1) need not hold either exactly or for all \(\theta \), though with corresponding consequences for the bounds obtained. The bound assumed for the difference \(\psi _\mu (\theta )-\psi _\nu (\theta )\) in Proposition 2.1 is also replaced by a sum involving different powers of \(|\theta |\) in the following theorem. This would at first sight seem superfluous, but is nonetheless useful for asymptotics, when the coefficients of the powers may depend in different ways on the ‘large’ parameter \(\rho \).

We say that a characteristic function \(\chi \) is \((\rho ,\theta _0)\)-locally normal if

$$\begin{aligned} |\chi (\theta )| \le e^{-\rho \theta ^2}, \quad 0 \le |\theta |\le \theta _0, \end{aligned}$$

and that characteristic functions \(\phi _\mu \) and \(\phi _\nu \) are \((\varepsilon ,\eta ,\theta _0)\)-mod \(\chi \) polynomially close, for some \(\varepsilon ,\eta > 0\) and \(0 < \theta _0 \le \pi \), if \(\phi _\mu = \psi _\mu \chi \) and \(\phi _\nu = \psi _\nu \chi \), and if, for some \(M\ge 0\) and positive pairs \(\gamma _m,t_m, \,1\le m\le M\),

$$\begin{aligned} |\psi _\mu (\theta ) - \psi _\nu (\theta )|&\ \le \sum _{m=1}^M \gamma _m|\theta |^{t_m} + \varepsilon , \quad 0 \le |\theta |\le \theta _0;\end{aligned}$$
$$\begin{aligned} |\phi _\mu (\theta ) - \phi _\nu (\theta )|&\ \le \eta , \qquad \theta _0 < |\theta | \le \pi . \end{aligned}$$

Note that, for practical purposes, the quantities \(\varepsilon \) and \(\eta \) should be as small as possible. Using these definitions, we can state the following theorem, whose proof follows that of Proposition 2.1 very closely, and is omitted.

Theorem 3.1

Let \(\mu \) and \(\nu \) be finite signed measures on \(\mathbb{Z }\), with characteristic functions \(\phi _\mu \) and \(\phi _\nu \) respectively. Suppose that \(\chi \) is \((\rho ,\theta _0)\)-locally normal, and that \(\phi _\mu \) and \(\phi _\nu \) are \((\varepsilon ,\eta ,\theta _0)\)-mod \(\chi \) polynomially close. Then, with \(\alpha _{lt}\) as for Proposition 2.1, and for any \(a_0 < b_0 \in \mathbb{Z }\), we have

$$\begin{aligned}1.\sup _{j\in \mathbb{Z }}|\mu \{j\} - \nu \{j\}| \ \le \ \sum _{m=1}^M \gamma _m \alpha _{1t_m} (\rho \vee 1)^{-(t_m+1)/2} + \tilde{\alpha }_1 \varepsilon + \tilde{\alpha }_2 \eta ;\end{aligned}$$
$$\begin{aligned}&2.\sup _{a_0 \le a \le b \le b_0}|\mu \{[a,b]\} - \nu \{[a,b]\}|\\&\qquad \ \le \ \sum _{m=1}^M \gamma _m \alpha _{2t_m} (\rho \vee 1)^{-t_m/2} + (b_0-a_0+1)(\tilde{\alpha }_1 \varepsilon + \tilde{\alpha }_2 \eta ), \end{aligned}$$


where

$$\begin{aligned} \tilde{\alpha }_{1} \ :=\ \Bigl (\frac{\theta _0}{\pi } \wedge \frac{1}{2\sqrt{\pi \rho }} \Bigr ); \quad \tilde{\alpha }_2 \ :=\ \Bigl (1 - \frac{\theta _0}{\pi } \Bigr ), \end{aligned}$$

and \(\gamma _1,\ldots ,\gamma _M\) are as in (3.2).

The first conclusion yields a bound on \(d_{\mathrm{loc}}(\mu ,\nu )\). However, the presence of the factor \((b_0-a_0+1)\) in the second bound means that, in contrast to the situation in Proposition 2.1, a direct bound on \(d_K(\mu ,\nu )\) is not immediately visible. The following result, giving bounds on both \(d_K(\mu ,\nu )\) and \(\Vert \mu -\nu \Vert \), is however easily deduced; for a signed measure \(\mu , \,|\mu |\) as usual denotes its variation.

Theorem 3.2

With the notation and conditions of Theorem 3.1,

$$\begin{aligned} d_{\mathrm{K}}(\mu ,\nu )&\le \inf _{a\le b}\left( \varepsilon ^{\mathrm{(K)}}_{ab} + (|\mu | + |\nu |)\{[a,b]^c\}\right) \!;\\ \Vert \mu -\nu \Vert&\le \inf _{a\le b}\left( \varepsilon ^{(1)}_{ab} + (|\mu | + |\nu |)\{[a,b]^c\}\right) \!, \end{aligned}$$


where

$$\begin{aligned} \varepsilon ^{\mathrm{(K)}}_{ab}&:= \sum _{m=1}^M\gamma _m \alpha _{2t_m} (\rho \vee 1)^{-t_m/2} + (b-a+1)(\tilde{\alpha }_1 \varepsilon + \tilde{\alpha }_2 \eta );\\ \varepsilon ^{(1)}_{ab}&:= (b-a+1)\left\{ \sum _{m=1}^M\gamma _m \alpha _{1t_m}(\rho \vee 1)^{-(t_m+1)/2} + (\tilde{\alpha }_1 \varepsilon + \tilde{\alpha }_2 \eta )\right\} , \end{aligned}$$

with \(\alpha _{lt}\) as for Proposition 2.1 and with \(\gamma _m\) as in (3.2). If also \(\mu \) is a probability measure and \(\nu (\mathbb{Z })=1\), then

$$\begin{aligned} d_{\mathrm{K}}(\mu ,\nu )&\le 2\inf _{a\le b}\left( \varepsilon ^{\mathrm{(K)}}_{ab} + |\nu |\{[a,b]^c\}\right) \!;\\ \Vert \mu -\nu \Vert&\le \inf _{a\le b}\left( \varepsilon ^{(1)}_{ab} + \varepsilon ^{\mathrm{(K)}}_{ab} + 2|\nu |\{[a,b]^c\}\right) \!. \end{aligned}$$


The inequality for the total variation norm is immediate. For the Kolmogorov distance, by considering the possible positions of \(x\) in relation to \(a<b\), we have

$$\begin{aligned}&|\mu \{(-\infty ,x]\} - \nu \{(-\infty ,x]\}|\\&\quad \ \le \ \sup _{y < a}|\mu \{(-\infty ,y]\} - \nu \{(-\infty ,y]\}| + \sup _{a \le y \le b}|\mu \{[a,y]\} - \nu \{[a,y]\}|\\&\qquad + \sup _{y > b}|\mu \{(b,y]\} - \nu \{(b,y]\}| \\&\quad \ \le \ (|\mu | + |\nu |)\{(-\infty ,a)\cup (b,\infty )\} + \varepsilon ^{\mathrm{(K)}}_{ab}. \end{aligned}$$

If \(\mu \) is a probability measure and \(\nu (\mathbb{Z })=1\), we have

$$\begin{aligned} |\mu |\{[a,b]^c\} \ =\ 1 - \mu \{[a,b]\} \ \le \ |1 - \nu \{[a,b]\}| + \varepsilon ^{\mathrm{(K)}}_{ab} \ \le \ |\nu |\{[a,b]^c\} + \varepsilon ^{\mathrm{(K)}}_{ab}. \end{aligned}$$

\(\square \)

Sharper total variation approximation

When using Theorem 3.2, it can safely be assumed that the tails of the well-known measure \(\nu \) can be suitably bounded. However, taking \(\chi \) to be the characteristic function of the Poisson distribution \(\mathrm{Po\,}(\lambda )\), for example, as in the example of Sect. 2, the measure of the tail set \([a,b]^c\) cannot be small unless \(b-a\) is large in comparison to \(\sqrt{\lambda }\); in an asymptotic sense, as \(\lambda \rightarrow \infty \) and since \(\lambda \asymp \rho \), one would need at least \(\rho ^{-1/2}(b-a) \rightarrow \infty \). As a result, the quantity \(\varepsilon ^{(1)}_{ab}\) appearing in the bound on the total variation distance would necessarily be of larger asymptotic order than \(\sum _{m=1}^M\gamma _m \alpha _{2t_m} \rho ^{-t_m/2}\), which, in view of the bound on \(d_K\), would nonetheless seem to be the ‘natural’ order of approximation. Under somewhat stronger conditions than those of Theorem 3.1, a total variation bound of this order can be deduced (at least, if the quantities \(\varepsilon \) and \(\eta \) are also suitably small); the argument is reminiscent of that in [8].

We say that a characteristic function \(\chi \) is \((\rho ,\gamma ^{\prime },\theta _0)\)-smoothly locally normal if \(\chi (\theta ) := e^{i\zeta \theta -u(\theta )}\) for some \(\zeta = \zeta _\chi \in \mathbb{R }\), and for some twice differentiable function \(u\) such that \(u(0)=u^{\prime }(0)=0\), and that

$$\begin{aligned} |u^{\prime \prime }(\theta )| \le \gamma ^{\prime }\rho \quad \text{ and }\quad \mathfrak R \{u(\theta )\} \ge \rho \theta ^2, \quad |\theta | \le \theta _0. \end{aligned}$$

Taking \(\chi = p_\lambda \) to be the characteristic function of the Poisson distribution \(\mathrm{Po\,}(\lambda )\), for example, we can set \(\zeta _\chi =\lambda \) and \(u(\theta ) = \lambda (1-e^{i\theta } + i\theta )\), showing that \(p_\lambda \) is \((\rho ,\gamma ^{\prime },\pi )\)-smoothly locally normal with \(\rho = 2\lambda /\pi ^2\) and \(\gamma ^{\prime } = \pi ^2/2\).
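The Poisson claim can be confirmed numerically on a grid of values of \(\theta \). The sketch below is only a sanity check under floating point tolerances, not a proof; the function names, the grid size and the tolerance are assumptions of the illustration.

```python
import cmath
import math


def u_poisson(lam, theta):
    """u(theta) = lam * (1 - e^{i theta} + i theta), so that p_lam = exp(i lam theta - u)."""
    return lam * (1.0 - cmath.exp(1j * theta) + 1j * theta)


def u_second(lam, theta):
    """Second derivative u''(theta) = lam * e^{i theta}."""
    return lam * cmath.exp(1j * theta)


def poisson_smoothly_locally_normal(lam, grid=2001, tol=1e-9):
    """Check (3.4) for chi = p_lam on a grid in [-pi, pi], with rho = 2 lam / pi^2."""
    rho = 2.0 * lam / math.pi ** 2
    gamma_prime = math.pi ** 2 / 2.0
    for n in range(grid):
        theta = -math.pi + 2.0 * math.pi * n / (grid - 1)
        u = u_poisson(lam, theta)
        p_lam = cmath.exp(lam * (cmath.exp(1j * theta) - 1.0))
        if abs(cmath.exp(1j * lam * theta - u) - p_lam) > tol:
            return False  # representation chi = exp(i zeta theta - u) fails
        if abs(u_second(lam, theta)) > gamma_prime * rho + tol:
            return False  # |u''| <= gamma' rho fails
        if u.real < rho * theta ** 2 - tol:
            return False  # Re u >= rho theta^2 fails
    return True


print(poisson_smoothly_locally_normal(5.0))
```

The inequality \(\mathfrak R \{u(\theta )\} \ge \rho \theta ^2\) is attained with equality at \(\theta = \pm \pi \), which is why a small tolerance is used in the comparison.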

For any \(\varepsilon ,\eta > 0\) and \(0 < \theta _0 \le \pi \), we then say that characteristic functions \(\phi _\mu \) and \(\phi _\nu \) are \((\varepsilon ,\eta ,\theta _0)\)-smoothly mod \(\chi \) polynomially close if \(\phi _\mu = \psi _\mu \chi \) and \(\phi _\nu = \psi _\nu \chi \), and if, for some \(M\ge 0\) and positive pairs \(\gamma _m,t_m, \,1\le m\le M\), there is a twice differentiable function \({\tilde{d}}_{\mu \nu }\) defined on \(|\theta | \le \theta _0\), for some \(0 < \theta _0 \le \pi /4\), such that \({\tilde{d}}_{\mu \nu }(0) = {\tilde{d}}_{\mu \nu }^{\prime }(0) = 0\) and

$$\begin{aligned}&|{\tilde{d}}_{\mu \nu }^{\prime \prime }(\theta )| \ \le \ \displaystyle \sum \limits _{m=1}^M \gamma _m|\theta |^{t_m-2},&\qquad |\theta | \le \theta _0; \end{aligned}$$
$$\begin{aligned}&e^{-\rho \theta ^2}|\psi _\mu (\theta ) - \psi _\nu (\theta ) - {\tilde{d}}_{\mu \nu }(\theta )| \ \le \ \varepsilon ,&\qquad |\theta | \le \theta _0; \end{aligned}$$
$$\begin{aligned}&|\phi _\mu (\theta ) - \phi _\nu (\theta )| \ \le \ \eta ,&\qquad \,\theta _0 < |\theta | \le \pi . \end{aligned}$$

Again, the smaller \(\varepsilon \) and \(\eta \), the better the bounds to be obtained.

Theorem 3.3

Let \(\mu \) and \(\nu \) be finite signed measures on \(\mathbb{Z }\), with characteristic functions \(\phi _\mu \) and \(\phi _\nu \) respectively. Suppose that \(\chi \) is \((\rho ,\gamma ^{\prime },\theta _0)\)-smoothly locally normal, and that \(\phi _\mu \) and \(\phi _\nu \) are \((\varepsilon ,\eta ,\theta _0)\)-smoothly mod \(\chi \) polynomially close. Assume also that \(\rho \ge 1\) and that \(\rho \theta _0^2 \ge \log \rho \). Then there is a function \(\alpha ^{\prime }:=\alpha ^{\prime }(t,\gamma )\) such that

$$\begin{aligned} \Vert \mu -\nu \Vert&\ \le {\sum _{m=1}^M}\gamma _m \alpha ^{\prime }(t_m,\gamma ^{\prime }) \rho ^{-t_m/2} + 3\rho \max \{\varepsilon ,\eta \}\\&+\, (|\mu |+|\nu |)\{(\lfloor \zeta _\chi \rfloor - \rho ,\lfloor \zeta _\chi \rfloor + \rho )^c\}, \end{aligned}$$

where \(\gamma _m\) and \(t_m\) are as in (3.5) and \(\gamma ^{\prime }\) is as in (3.4). If \(\mu \) is a probability measure and \(\nu (\mathbb{Z }) = 1\), then

$$\begin{aligned} \Vert \mu -\nu \Vert \ \le \ 2{\sum _{m=1}^M}\gamma _m \alpha ^{\prime }(t_m,\gamma ^{\prime }) \rho ^{-t_m/2} \!+\! 6\rho \max \{\varepsilon ,\eta \} \!+\! 2|\nu |\{(\lfloor \zeta _\chi \rfloor \!-\! \rho ,\lfloor \zeta _\chi \rfloor + \rho )^c\}. \end{aligned}$$

If (3.5) and (3.6) hold with \(\varepsilon =0\) for all \(0\le |\theta |\le \pi \), then there is a function \(\alpha ^*:=\alpha ^*(t,\gamma )\) such that

$$\begin{aligned} \Vert \mu -\nu \Vert \ \le \ {\sum _{m=1}^M}\gamma _m \alpha ^*(t_m,\gamma ^{\prime }) \rho ^{-t_m/2}. \end{aligned}$$

Writing \(H := {\sum _{m=1}^M}\gamma _m \rho ^{-(t_m+2)/2} + \max \{\varepsilon ,\eta \}\), it is clearly enough to show that, for any \(j \in (\lfloor \zeta _\chi \rfloor - \rho ,\lfloor \zeta _\chi \rfloor + \rho )\),

$$\begin{aligned} |\mu \{j\} - \nu \{j\}| \ =\ \frac{1}{2\pi }\left| \,\,\int _{-\pi }^\pi e^{-ij\theta }(\phi _\mu (\theta ) - \phi _\nu (\theta ))\,d\theta \right| \ \le \ KH, \end{aligned}$$

for some constant \(K\), giving a total contribution to the bound from such \(j\) of order \(O(\rho H)\). In view of (3.6) and (3.7), the main effort is to bound \(\int _{-\theta _0}^{\theta _0} e^{-\rho \theta ^2}|{\tilde{d}}_{\mu \nu }(\theta )|d\theta \); however, using (3.5) directly gives a bound of order \(O(\rho ^{1/2}H)\), which is too large. To get round this, for \(|j-\zeta _\chi |\) bigger than \(\rho ^{1/2}\), we write \(e^{-ij\theta }(\phi _\mu (\theta ) - \phi _\nu (\theta )) = e^{i(\zeta _\chi -j)\theta -u(\theta )}(\psi _\mu (\theta ) - \psi _\nu (\theta ))\), and integrate (3.8) twice by parts, to get a factor of \((j-\zeta _\chi )^2\) in the denominator. To make this argument work, we need to continue the function \({\tilde{w}}(\theta ) := e^{-u(\theta )}{\tilde{d}}_{\mu \nu }(\theta )\) into \(\theta _0 < |\theta | \le \pi \) in suitable fashion. For this, we use the following technical lemma, whose proof is given in the Appendix.

Lemma 3.4

Let \(w:(-\infty ,0] \rightarrow \mathbb{R }\) be such that \(w(0)=a\) and \(w^{\prime }(0) = b\). Then \(w\) can be continued differentiably on \([0,\infty )\) by a piecewise quadratic function such that \(|w^{\prime \prime }(x)| \le c\) for all \(x > 0\) for which \(w^{\prime \prime }(x)\) is defined, and such that \(w(x)=0\) for all

$$\begin{aligned} x \ \ge \ \frac{1}{c} \left\{ |b| + 2\sqrt{|ac + {\textstyle \frac{1}{2}}\mathrm{sgn}(b) b^2|} \right\} ; \end{aligned}$$

furthermore, \(\max _{x\ge 0} |w(x)| \le |a| + b^2/2c\).

We then write

$$\begin{aligned} \mu \{j\} - \nu \{j\} \ =\ \frac{1}{2\pi }\int _{-\pi }^\pi e^{-i(j-\zeta _\chi )\theta }\left\{ e^{-u(\theta )}[d_{\mu \nu }(\theta ) - {\tilde{d}}_{\mu \nu }(\theta )] + {\tilde{w}}(\theta ) \right\} d\theta ,\quad \end{aligned}$$

where \(d_{\mu \nu }:= \psi _\mu -\psi _\nu \), and, for each \(j\), bound the two parts of the final expression separately.

Proof of Theorem 3.3

(i). For the first step, we use Lemma 3.4 to continue the real and imaginary parts of \({\tilde{w}}(\theta )\) into \(\theta _0 \le |\theta | \le \pi \), in such a way that \({\tilde{w}}\) is piecewise twice differentiable on \([-\pi ,\pi ]\) and satisfies

$$\begin{aligned} {\tilde{w}}(-\pi ) \ =\ {\tilde{w}}(\pi ) \ =\ {\tilde{w}}^{\prime }(-\pi ) \ =\ {\tilde{w}}^{\prime }(\pi ) \ =\ 0, \end{aligned}$$

with the second derivatives of the real and imaginary parts suitably bounded. Since

$$\begin{aligned} {\tilde{w}}^{\prime }(\theta ) \ =\ e^{-u(\theta )}\{{\tilde{d}}_{\mu \nu }^{\prime }(\theta ) - u^{\prime }(\theta ){\tilde{d}}_{\mu \nu }(\theta )\}, \end{aligned}$$

it follows from (3.4) and (3.5) that

$$\begin{aligned} |{\tilde{w}}(\theta _0)|&\le {\sum _{m=1}^M}\frac{\gamma _m}{t_m(t_m-1)}\, \theta _0^{t_m} e^{-\rho \theta _0^2} \ \le \ {\sum _{m=1}^M}|a_m|; \end{aligned}$$
$$\begin{aligned} |{\tilde{w}}^{\prime }(\theta _0)|&\le {\sum _{m=1}^M}\frac{\gamma _m}{t_m(t_m-1)}\, \theta _0^{t_m-1} e^{-\rho \theta _0^2} \{ t_m + \gamma ^{\prime }\rho \theta _0^2\} \ \le \ {\sum _{m=1}^M}|b_m|, \end{aligned}$$


where

$$\begin{aligned} |a_m| \ :=\ t_m^{-1}\gamma _m \kappa _1(t_m,\gamma ^{\prime }) \theta _0^{t_m} e^{-\rho \theta _0^2},\quad |b_m| \ :=\ \gamma _m \kappa _1(t_m,\gamma ^{\prime }) \rho \theta _0^{t_m+1} e^{-\rho \theta _0^2}, \end{aligned}$$

and \(\kappa _1(t,\gamma ) := (t+\gamma )/\{t(t-1)\}\). Hence we can continue \({\tilde{w}}\) in \(\theta _0 \le \theta \le \pi \) by a sum of functions \({\sum _{m=1}^M}{\tilde{w}}_m\), where \(|{\tilde{w}}_m(\theta _0)| \le |a_m|\) and \(|{\tilde{w}}^{\prime }_m(\theta _0)| \le |b_m|\) for each \(m\), and these bounds at \(\theta _0\) hold also for the real and imaginary parts \({\tilde{w}}_{mr}\) and \({\tilde{w}}_{mi}\) of \({\tilde{w}}_m\). Define \({\tilde{w}}_{mr}\) and \({\tilde{w}}_{mi}\) in \(\theta _0 \le \theta \le \pi \) using Lemma 3.4, in each case restricting their second derivatives by taking

$$\begin{aligned} c_m \ :=\ 4\gamma _m \kappa _1(t_m,\gamma ^{\prime }) \rho ^2 \theta _0^{t_m+2} e^{-\rho \theta _0^2}. \end{aligned}$$

Then it follows from the lemma that the length of the \(\theta \)-interval beyond \(\theta _0\) on which \({\tilde{w}}_m\) is not identically zero is bounded by

$$\begin{aligned} \frac{1}{c_m} \left\{ |b_m|(1+\sqrt{2}) + 2\sqrt{|a_m| c_m} \right\} \ \le \ \frac{1 + 3\sqrt{2}}{4\rho \theta _0} \ \le \ \ell \ :=\ \frac{2}{\rho \theta _0}, \end{aligned}$$

from (3.11) and (3.12), the bound being the same for all \(m\); note that

$$\begin{aligned} \ell \ \le \ \frac{2\theta _0}{\rho \theta _0^2} \ \le \ \frac{\pi }{2}, \end{aligned}$$

since \(\theta _0 \le \pi /4\) and \(\rho \theta _0^2 \ge 1\). From this and (3.14), and from the analogous continuation in \(-\pi \le \theta \le -\theta _0\), it follows also that

$$\begin{aligned} \int _{\theta _0 < |\theta | \le \pi } |{\tilde{w}}_m^{\prime \prime }(\theta )|\,d\theta \ \le \ 4\ell c_m \ \le \ 32 \gamma _m \kappa _1(t_m,\gamma ^{\prime }) \rho \theta _0^{t_m+1}e^{-\rho \theta _0^2}, \end{aligned}$$

and, using (3.11), (3.12), (3.13) and Lemma 3.4, that

$$\begin{aligned} \rho \int _{\theta _0 < |\theta | \le \pi } |{\tilde{w}}_m(\theta )|\,d\theta \ \le \ 4\ell \rho \{|a_m| + b_m^2/2c_m\} \ \le \ 5\, \gamma _m \kappa _1(t_m,\gamma ^{\prime }) \theta _0^{t_m-1}e^{-\rho \theta _0^2}.\nonumber \\ \end{aligned}$$

(ii). The next step is to bound the first part of the integral in (3.9). Here, by (3.6) and (3.7), we have \(|e^{-u(\theta )}[d_{\mu \nu }(\theta ) - {\tilde{d}}_{\mu \nu }(\theta )]| \ \le \ \varepsilon \) in \(|\theta | \le \theta _0\), whereas, in \(\theta _0 < |\theta | \le \pi \), it is bounded by \(\eta + |{\tilde{w}}(\theta )|\). Hence, for any \(j\), we use (3.17) to give

$$\begin{aligned}&\left| \frac{1}{2\pi }\int _{-\pi }^\pi e^{-i(j-\zeta _\chi )\theta } e^{-u(\theta )}[d_{\mu \nu }(\theta ) - {\tilde{d}}_{\mu \nu }(\theta )] \,d\theta \right| \nonumber \\&\quad \ \le \ \max \{\varepsilon ,\eta \} + \frac{5}{2\pi \rho }{\sum _{m=1}^M}\gamma _m \kappa _1(t_m,\gamma ^{\prime }) \theta _0^{t_m-1}e^{-\rho \theta _0^2}. \end{aligned}$$

Noting also that, if \(\rho \theta ^2 \ge \log \rho \ge 0, \,\theta > 0\) and \(t\ge 2\), then

$$\begin{aligned} \rho ^{t/2}\theta ^{t-1}e^{-\rho \theta ^2} \ =\ \{\rho e^{-\rho \theta ^2}\}^{1/2}(\rho \theta ^2)^{(t-1)/2}e^{-\rho \theta ^2/2} \ \le \ k_2(t), \end{aligned}$$

for \(k_2(t) = \{(t-1)/e\}^{(t-1)/2}\), it follows that

$$\begin{aligned}&\left| \frac{1}{2\pi }\int _{-\pi }^\pi e^{-i(j-\zeta _\chi )\theta } e^{-u(\theta )}[d_{\mu \nu }(\theta ) - {\tilde{d}}_{\mu \nu }(\theta )] \,d\theta \right| \nonumber \\&\quad \ \le \ \max \{\varepsilon ,\eta \} + \frac{5}{2\pi \rho }{\sum _{m=1}^M}\gamma _m \kappa _1(t_m,\gamma ^{\prime })k_2(t_m)\rho ^{-t_m/2}. \end{aligned}$$

This bounds the first element of (3.9) as \(O(H)\) for all \(j\).
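The elementary inequality used in this step can be checked numerically. The following is a minimal sketch, not part of the proof: the function names and the grid of test values are illustrative assumptions, and the check only samples the admissible region \(\rho \theta ^2 \ge \log \rho \ge 0\), \(0 < \theta \le \pi \).

```python
import math

def k2(t):
    """k_2(t) = ((t - 1) / e) ** ((t - 1) / 2), as defined in the text."""
    return ((t - 1) / math.e) ** ((t - 1) / 2)

def lhs(rho, theta, t):
    """Left-hand side rho^(t/2) * theta^(t-1) * exp(-rho * theta^2)."""
    return rho ** (t / 2) * theta ** (t - 1) * math.exp(-rho * theta ** 2)

def worst_ratio(t, rhos, n_grid=2000):
    """Largest lhs / k2 over a theta-grid with rho * theta^2 >= log(rho) >= 0."""
    worst = 0.0
    for rho in rhos:
        # Smallest admissible theta for this rho (hypothetical grid start).
        theta_min = math.sqrt(math.log(rho) / rho)
        for i in range(n_grid + 1):
            theta = theta_min + (math.pi - theta_min) * i / n_grid
            if theta > 0.0:
                worst = max(worst, lhs(rho, theta, t) / k2(t))
    return worst

if __name__ == "__main__":
    for t in (2.0, 2.5, 3.0, 5.0):
        print(t, worst_ratio(t, rhos=(1.0, 2.0, 10.0, 100.0, 1000.0)))
```

A ratio not exceeding one on the sampled grid is consistent with the inequality; it is of course no substitute for the two-line argument given above.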

(iii). For the second part of (3.9), we begin by considering values of \(j\) such that \(|j-\zeta _\chi | < 1 + \lceil \sqrt{\rho }\rceil \). Here, we write

$$\begin{aligned}&\left| \frac{1}{2\pi }\int _{-\pi }^\pi e^{-i(j-\zeta _\chi )\theta } {\tilde{w}}(\theta ) \,d\theta \right| \ \le \ \frac{1}{2\pi } \int _{-\pi }^\pi |{\tilde{w}}(\theta )|\,d\theta \\&\quad \ \le \ \frac{1}{2\pi } \int _{|\theta | \le \theta _0} e^{-\rho \theta ^2}|{\tilde{d}}_{\mu \nu }(\theta )|\,d\theta + \frac{1}{2\pi } {\sum _{m=1}^M}\int _{\,\,\theta _0 < |\theta | \le \pi } |{\tilde{w}}_m(\theta )|\,d\theta . \end{aligned}$$

Since, by (3.5),

$$\begin{aligned} |{\tilde{d}}_{\mu \nu }^{\prime }(\theta )| \ \le \ {\sum _{m=1}^M}\frac{\gamma _m}{t_m-1}|\theta |^{t_m-1} \quad \text{ and }\quad |{\tilde{d}}_{\mu \nu }(\theta )| \ \le \ {\sum _{m=1}^M}\frac{\gamma _m}{t_m(t_m-1)}|\theta |^{t_m},\qquad \end{aligned}$$

the first integral is bounded, as in the proof of Proposition 2.1, by

$$\begin{aligned} {\sum _{m=1}^M}\gamma _m \frac{\alpha _{1t_m}}{t_m(t_m-1)}\,\rho ^{-(t_m+1)/2}, \end{aligned}$$

and the second is bounded, as above, by

$$\begin{aligned} \frac{5}{2\pi \rho }{\sum _{m=1}^M}\gamma _m \kappa _1(t_m,\gamma ^{\prime })k_2(t_m)\rho ^{-t_m/2}, \end{aligned}$$

giving the bound

$$\begin{aligned}&\left| \frac{1}{2\pi }\int _{-\pi }^\pi e^{-i(j-\zeta _\chi )\theta } {\tilde{w}}(\theta ) \,d\theta \right| \nonumber \\&\quad \ \le \ {\sum _{m=1}^M}\gamma _m \left\{ \frac{\alpha _{1t_m}}{t_m(t_m-1)} + \frac{5}{2\pi }\kappa _1(t_m,\gamma ^{\prime })k_2(t_m) \right\} \rho ^{-(t_m+1)/2}, \end{aligned}$$

since also \(\rho \ge 1\). The bound is of order \(O(\rho ^{1/2}H)\), but there are only at most \(4\,+\,2\sqrt{\rho }\le 6\sqrt{\rho }\) integers \(j\) satisfying \(|j-\zeta _\chi | < 1 + \lceil \sqrt{\rho }\rceil \), so that their sum is of order \(O(\rho H)\), which is as required.

(iv). For \(|j-\zeta _\chi | \ge 1 + \lceil \sqrt{\rho }\rceil \), integrating twice by parts and using (3.10), it follows that

$$\begin{aligned} \frac{1}{2\pi }\int _{-\pi }^\pi e^{-i(j-\zeta _\chi )\theta } {\tilde{w}}(\theta ) \,d\theta \ =\ -\frac{1}{2\pi (j-\zeta _\chi )^2}\int _{-\pi }^\pi e^{-i(j-\zeta _\chi )\theta } {\tilde{w}}^{\prime \prime }(\theta ) \,d\theta ,\quad \end{aligned}$$


where

$$\begin{aligned} {\tilde{w}}^{\prime \prime }(\theta ) \ =\ \left( {\tilde{d}^{\prime \prime }_{\mu \nu }}{}(\theta ) - 2{\tilde{d}^{\prime }_{\mu \nu }}{}(\theta )u^{\prime }(\theta ) + {\tilde{d}_{\mu \nu }}(\theta )\{(u^{\prime }(\theta ))^2 - u^{\prime \prime }(\theta )\}\right) \,e^{-u(\theta )}\qquad \end{aligned}$$

in \(|\theta | \le \theta _0\). Hence, using (3.5), (3.20) and the fact that, from (3.4), \(|u^{\prime }(\theta )| \le \gamma ^{\prime }\rho |\theta |\) in \(|\theta |\le \theta _0\), the part of the integral in (3.23) for this range of \(\theta \) can be bounded by

$$\begin{aligned}&\left| \,\, \int _{|\theta | \le \theta _0} e^{-i(j-\zeta _\chi )\theta } {\tilde{w}}^{\prime \prime }(\theta ) \,d\theta \right| \nonumber \\&\ \le \ {\sum _{m=1}^M}\; \int _{\,|\theta | \le \theta _0} \gamma _m \left\{ |\theta |^{t_m-2} \!+\! \frac{2\gamma ^{\prime }\rho }{t_m-1}\,|\theta |^{t_m} \!+\! \frac{\gamma ^{\prime }\rho }{t_m(t_m\!-\!1)}\,|\theta |^{t_m}(1 \!+\! \gamma ^{\prime }\rho \theta ^2)\right\} e^{-\rho \theta ^2}\,d\theta \nonumber \\&\ \le \ {\sum _{m=1}^M}\gamma _m\,\beta ^{\prime }(t_m,\gamma ^{\prime })\rho ^{-(t_m-1)/2}, \end{aligned}$$

after some calculation, where, with \(m_t\) as in Proposition 2.1,

$$\begin{aligned} \beta ^{\prime }(t,\gamma ^{\prime }) \ :=\ \frac{m_{t-2}}{4t\,2^{t/2}\sqrt{\pi }}\{4t+2(2t+1)\gamma ^{\prime } + (t+1)(\gamma ^{\prime })^2\}. \end{aligned}$$

The remaining part of the integral in (3.23), for \(\theta _0 < |\theta | \le \pi \), yields an additional element of

$$\begin{aligned} {\sum _{m=1}^M}\int _{\,\,\theta _0 < |\theta | \le \pi } |{\tilde{w}}^{\prime \prime }_m(\theta )|\,d\theta&\le 32 {\sum _{m=1}^M}\gamma _m \kappa _1(t_m,\gamma ^{\prime }) \rho \theta _0^{t_m+1}e^{-\rho \theta _0^2}\nonumber \\&\le 32 {\sum _{m=1}^M}\gamma _m \kappa _1(t_m,\gamma ^{\prime })\,k_3(t_m)\rho ^{-(t_m-1)/2}, \end{aligned}$$

from (3.16), with \(k_3(t) := \{(t+1)/2e\}^{(t+1)/2}\). As a result, we find that, for \(|j-\zeta _\chi | \ge 1 + \lceil \sqrt{\rho }\rceil \), the second part of (3.9) can be bounded by

$$\begin{aligned}&\left| \frac{1}{2\pi }\int _{-\pi }^\pi e^{-i(j-\zeta _\chi )\theta } {\tilde{w}}(\theta ) \,d\theta \right| \nonumber \\&\quad \le \frac{1}{(j-\zeta _\chi )^2} {\sum _{m=1}^M}\gamma _m\left\{ \beta ^{\prime }(t_m,\gamma ^{\prime }) + \frac{16}{\pi }\kappa _1(t_m,\gamma ^{\prime })\,k_3(t_m)\right\} \rho ^{-(t_m-1)/2},\qquad \quad \end{aligned}$$

and adding over \(|j-\zeta _\chi | \ge 1 + \lceil \sqrt{\rho }\rceil \) gives a contribution of order \(O(\rho H)\).

(v). The final step is to make the arbitrary choice \(s = \rho \) in the bound

$$\begin{aligned} \Vert \mu -\nu \Vert \ \le \ \sum _{|j-\lfloor \zeta _\chi \rfloor | < s}|\mu \{j\} - \nu \{j\}| + (|\mu | + |\nu |)\{(\lfloor \zeta _\chi \rfloor -s,\lfloor \zeta _\chi \rfloor +s)^c\}, \end{aligned}$$

and to note that, if \(\mu \) is a probability measure and \(\nu (\mathbb{Z }) = 1\), then

$$\begin{aligned} |\mu |\{(a,b)^c\}&= 1 - \mu \{(a,b)\} \ \le \ |1 - \nu \{(a,b)\}| + |\nu \{(a,b)\} - \mu \{(a,b)\}| \\&\le |\nu |\{(a,b)^c\} + \sum _{a < j < b} |\mu \{j\} - \nu \{j\}|. \end{aligned}$$

(vi). If (3.5) and (3.6) hold with \(\varepsilon =0\) for all \(0\le |\theta |\le \pi \) (implying, in particular, that \(\eta \) is irrelevant), the proof simplifies dramatically. The considerations concerning \(\theta _0 < |\theta | \le \pi \) become unnecessary. This leaves the bound

$$\begin{aligned} |\mu (j)-\nu (j)| \ =\ \left| \frac{1}{2\pi }\int _{-\pi }^\pi e^{-i(j-\zeta _\chi )\theta } {\tilde{w}}(\theta ) \,d\theta \right| \ \le \ {\sum _{m=1}^M}\frac{\gamma _m\alpha _{1t_m}}{t_m(t_m-1)} \rho ^{-(t_m+1)/2}\quad \qquad \end{aligned}$$

for \(|j-\zeta _\chi | < 1 + \lceil \sqrt{\rho }\rceil \), where \({\tilde{w}}(\theta ) = e^{-u(\theta )}d_{\mu \nu }(\theta )\). Then, since \(e^{-i(j-\zeta _\chi )\theta } {\tilde{w}}(\theta )\) is a \(2\pi \)-periodic function, the integration by parts in (3.23) remains true, giving the bound

$$\begin{aligned} |\mu (j)-\nu (j)| \ \le \ \frac{1}{(j-\zeta _\chi )^2} {\sum _{m=1}^M}\gamma _m \beta ^{\prime }(t_m,\gamma ^{\prime }) \rho ^{-(t_m-1)/2} \end{aligned}$$

for \(|j-\zeta _\chi | \ge 1 + \lceil \sqrt{\rho }\rceil \). Adding over all \(j\) gives the final bound, with \(\alpha ^*(t,\gamma ^{\prime }) := 2\beta ^{\prime }(t,\gamma ^{\prime }) + {6\alpha _{1t}}/\{t(t-1)\}\). \(\square \)

In certain applications, the difference \(d_{\mu \nu }\) is expressed in the form \(d_{\mu \nu }(\theta ) = {\hat{d}}_{\mu \nu }(e^{i\theta }-1)\). If it is true that \({\hat{d}}_{\mu \nu }(0) = {\hat{d}}_{\mu \nu }^{\prime }(0) = 0\) and \(|{\hat{d}}_{\mu \nu }^{\prime \prime }(w)| \le {\hat{\gamma }}|w|^{t-2}\) for complex \(w\) such that \(|w| \le \theta _0\), then it follows that \(d_{\mu \nu }(0) = d_{\mu \nu }^{\prime }(0) = 0\) and that

$$\begin{aligned} |d_{\mu \nu }^{\prime \prime }(\theta )| \ \le \ \Bigl (1 + \frac{2\wedge \theta _0}{t-1} \Bigr ){\hat{\gamma }}|\theta |^{t-2},\qquad |\theta | \le \theta _0. \end{aligned}$$

Approximating probability distributions

The general case

The most common application of the general bounds is when \(\mu \) is a probability distribution which is close to a member \(R_\lambda \) of a family \(\{R_\lambda ,\,\lambda >0\}\) of probability distributions on the integers, and one is interested in bounds when \(\lambda \) is large. Suppose, in particular, that the characteristic function \(r_\lambda \) of \(R_\lambda \) is \((\rho ,\gamma ^{\prime },\pi )\)-smoothly locally normal, and that \(\phi _\mu = \psi r_\lambda \), where \(\psi \) has a polynomial approximation \({\tilde{\psi }}_r\) as given in (2.4), for some \(r \in \mathbb{N }\) and \(\tilde{a}_1,\ldots ,\tilde{a}_r \in \mathbb{R }\). This indicates that \(\mu \) may be close to \(\nu = \nu _r(R_\lambda ;\tilde{a}_1,\ldots ,\tilde{a}_r)\) given in (2.6). The following corollary, in which we use a more probabilistic notation for \(\mu \), establishes the corresponding results.

Corollary 4.1

Let \(X\) be an integer valued random variable with distribution \(P_X\) and characteristic function \(\phi _X := \psi r_\lambda \), where \(r_\lambda \) is a \((\rho ,\gamma ^{\prime },\theta _0)\)-smoothly locally normal characteristic function and \(\rho \ge 1\). Let \({\tilde{\psi }}_r\) be as in (2.4). Then, if \(\phi _X\) and \({\tilde{\psi }}_r r_\lambda \) are \((\varepsilon ,\eta ,\theta _0)\)-mod \(r_\lambda \) polynomially close, it follows that

  1. \(d_{\mathrm{loc}}(P_X,\nu _r) \ \le \ {\sum _{m=1}^M}\gamma _m \alpha _{1t_m} \rho ^{-(t_m+1)/2} + \tilde{\alpha }_1 \varepsilon + \tilde{\alpha }_2 \eta ;\)

  2. \(d_K(P_X,\nu _r) \ \le \ 2\inf _{a\le b}\left( \varepsilon ^{\mathrm{(K)}}_{ab} + |\nu _r|\{[a,b]^c\}\right) \!;\)

  3. \(\Vert P_X - \nu _r\Vert \ \le \ \inf _{a\le b}\left( \varepsilon ^{(1)}_{ab} + \varepsilon ^{\mathrm{(K)}}_{ab} + 2|\nu _r|\{[a,b]^c\}\right) \!,\)

where the quantities appearing in the bounds are as in Theorem 3.2, and with \(\nu _r \ =\ \nu _r(R_\lambda ;\tilde{a}_1,\ldots ,\tilde{a}_r)\) as defined in (2.6). Furthermore, if \(\phi _X\) and \({\tilde{\psi }}_r r_\lambda \) are \((\varepsilon ,\eta ,\theta _0)\)-smoothly mod \(r_\lambda \) polynomially close, then

  4.
    $$\begin{aligned}&\Vert P_X - \nu _r\Vert \le 2{\sum _{m=1}^M}\gamma _m \alpha ^{\prime }(t_m,\gamma ^{\prime }) \rho ^{-t_m/2} + 6\rho \max \{\varepsilon ,\eta \}\\&\qquad +\, 2|\nu _r|\{(\lfloor \zeta _{r_\lambda }\rfloor -\rho ,\lfloor \zeta _{r_\lambda }\rfloor + \rho )^c\}, \end{aligned}$$

and, if (3.5) and (3.6) hold with \(\varepsilon =0\) for all \(0\le |\theta |\le \pi \), then

  5. \( \Vert P_X - \nu _r\Vert \ \le \ {\sum _{m=1}^M}\gamma _m \alpha ^*(t_m,\gamma ^{\prime }) \rho ^{-t_m/2}\).


Taking \(\psi _\mu = 0\) and \(\psi _\nu = (e^{i\theta }-1)^l\) in Theorem 3.3 for \(l\ge 2\) gives \(|d^{\prime \prime }_{\mu \nu }(\theta )| \le l(l+1)|\theta |^{l-2}\) for all \(\theta \) and \(d^{\prime }_{\mu \nu }(0)=0\), where \(d_{\mu \nu }(\theta ) = \psi _\mu (\theta ) - \psi _\nu (\theta )\). Hence, by the final part of the theorem, the contribution from the \(l\)-th term in the signed measure \(\nu _r\) of (2.6) has total variation norm at most \(\alpha ^*(l,\gamma ^{\prime }) l(l+1) |\tilde{a}_l| \rho ^{-l/2}\), for \(2\le l\le r\).
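The derivative bound \(|d^{\prime \prime }_{\mu \nu }(\theta )| \le l(l+1)|\theta |^{l-2}\) follows from \(|e^{i\theta }-1| \le \min \{|\theta |,2\}\), and can be sampled numerically. The sketch below is illustrative only: the function names, the grid and the tolerance are assumptions, and the second derivative is written out by hand for \(d(\theta ) = -(e^{i\theta }-1)^l\).

```python
import cmath
import math

def d_second(theta, l):
    """Second derivative in theta of d(theta) = -(exp(i*theta) - 1)**l."""
    e = cmath.exp(1j * theta)
    w = e - 1.0
    # d'  = -l * w**(l-1) * (i*e)
    # d'' = l*(l-1) * w**(l-2) * e**2 + l * w**(l-1) * e
    return l * (l - 1) * w ** (l - 2) * e ** 2 + l * w ** (l - 1) * e

def bound_holds(l, n_grid=1000, tol=1e-12):
    """Check |d''(theta)| <= l(l+1)|theta|^(l-2) on a grid of [-pi, pi]."""
    for i in range(-n_grid, n_grid + 1):
        theta = math.pi * i / n_grid
        if abs(d_second(theta, l)) > l * (l + 1) * abs(theta) ** (l - 2) + tol:
            return False
    return True

if __name__ == "__main__":
    print([bound_holds(l) for l in range(2, 7)])
```

For \(l=2\) the bound is attained at \(\theta = \pm \pi \), which is why a small floating-point tolerance is allowed in the comparison.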

Probability distributions as approximations

The use of signed measures to approximate probability distributions is convenient, but not very natural. However, the signed measures \(\nu _1(R_\lambda ;\tilde{a}_1)\) and \(\nu _2(R_\lambda ;\tilde{a}_1,\tilde{a}_2)\) can often be replaced by suitably translated members of the family \(\{R_\lambda ,\,\lambda > 0\}\), with the same asymptotic rate of approximation, by fitting the first two moments, a procedure analogous to that used in the Berry–Esseen theorem. We accomplish this under some further mild assumptions on the distributions \(R_\lambda \).

We call the family \(\{R_\lambda ,\,\lambda >0\}\) amenable if the following three conditions are satisfied. First, the characteristic functions \(r_\lambda \) are to be \((\rho (\lambda ),\gamma ^{\prime },\pi )\)-smoothly locally normal (with the same value of \(\gamma ^{\prime }\) for all), where \(\lim _{\lambda \rightarrow \infty }\rho (\lambda ) = \infty \); secondly, if \(b_1 := b_1(\lambda ,\lambda ^{\prime })\) and \(b_2 := b_2(\lambda ,\lambda ^{\prime })\) are chosen to make the first two derivatives of the function

$$\begin{aligned} w_{\lambda ,\lambda ^{\prime }}(\theta ) \ :=\ r_{\lambda ^{\prime }}(\theta ) - r_\lambda (\theta )\{1 + b_1(e^{i\theta }-1) + b_2(e^{i\theta }-1)^2\} \end{aligned}$$

vanish at zero (\(w_{\lambda ,\lambda ^{\prime }}(0) = 0\) is automatic), then \(\delta _{\lambda ,\lambda ^{\prime }}(\theta ) := w_{\lambda ,\lambda ^{\prime }}(\theta )/r_\lambda (\theta )\) is to satisfy

$$\begin{aligned} |\delta _{\lambda ,\lambda ^{\prime }}^{\prime \prime }(\theta )| \ \le \ f(|\lambda -\lambda ^{\prime }|)|\theta |, \quad |\theta | \le \pi , \end{aligned}$$

for some continuous function \(f:\,\mathbb{R }_+ \rightarrow \mathbb{R }_+\); and thirdly, if \(Z_\lambda \sim R_\lambda \), then \(\mu (\lambda ) := \mathbb{E }Z_\lambda \) and \(\sigma ^2(\lambda ) := \mathrm{Var\,}Z_\lambda \) should exist, with \(\sigma ^2(\cdot )\) strictly increasing from zero to infinity, and the functions \(\mu (\cdot ), \,\sigma ^2(\cdot )\) and \((\sigma ^2)^{-1}(\cdot )\) are all to be uniformly continuous.

The quantities \(b_1\) and \(b_2\), as defined in (4.1), can be explicitly expressed:

$$\begin{aligned} b_1(\lambda ,\lambda ^{\prime }) \ =\ \mu (\lambda ^{\prime })-\mu (\lambda );\quad b_2(\lambda ,\lambda ^{\prime }) \ =\ {\textstyle \frac{1}{2}}\{\sigma ^2(\lambda ^{\prime })-\sigma ^2(\lambda ) - b_1(1-b_1)\},\nonumber \\ \end{aligned}$$

and it follows from (4.2) that

$$\begin{aligned} |\delta _{\lambda ,\lambda ^{\prime }}(\theta )| \ \le \ {\textstyle \frac{1}{6}}f(|\lambda -\lambda ^{\prime }|)|\theta |^3, \quad |\theta | \le \pi . \end{aligned}$$

Note that the Poisson family \(\{\mathrm{Po\,}(\lambda ),\,\lambda >0\}\) is amenable.

Now the signed measures \(\nu _r, \,r\ge 2\), have mean and variance given by

$$\begin{aligned} \mu _* = \mu (\lambda )+\tilde{a}_1 ;\quad \sigma ^2_* \ =\ \sigma ^2(\lambda ) + 2\tilde{a}_2 + \tilde{a}_1(1-\tilde{a}_1), \end{aligned}$$

and the corresponding equations for \(\nu _1\) just have \(\tilde{a}_2=0\). However, when choosing a translation of \(R_\lambda \) to match these moments, only integer translations \(m\) of \(R_\lambda \) can be allowed, since the distributions must remain on the integers, and so it is not possible to match both moments exactly within the family. To circumvent this, we extend to approximation by a member of the family of probability distributions \(Q_{mp}(R_{\lambda ^{\prime }})\), for \(\lambda ^{\prime }>0, \,m\in \mathbb{Z }\) and \(0\le p < 1\), where

$$\begin{aligned} Q_{mp}(R_{\lambda ^{\prime }})\{j\} \ :=\ pR_{\lambda ^{\prime }}\{j-m-1\} + (1-p)R_{\lambda ^{\prime }}\{j-m\}. \end{aligned}$$

If \(Z \sim R_{\lambda ^{\prime }}\), then \(Q_{mp}(R_{\lambda ^{\prime }})\) is the distribution of \(Z+m+I\), where \(I \sim \mathrm{Be\,}(p)\) is independent of \(Z\). \(Q_{mp}(R_{\lambda ^{\prime }})\) has characteristic function \(q_{mp}(R_{\lambda ^{\prime }})\) given by

$$\begin{aligned} q_{mp}(R_{\lambda ^{\prime }})(\theta ) \ :=\ e^{im\theta }(1+p(e^{i\theta }-1))r_{\lambda ^{\prime }}(\theta ), \end{aligned}$$

similar to the measure \(\nu _2\{R_{\lambda ^{\prime }};m+p,{m\atopwithdelims ()2}+mp\}\), but with terms of higher order as powers of \((e^{i\theta }-1)\) as well.

Among the distributions \(\{Q_{mp}(R_{\lambda ^{\prime }});\, \lambda ^{\prime } > 0, m\in \mathbb{Z }, 0 \le p < 1\}\), we can always find one having a given mean \(\mu _*\) and variance \(\sigma ^2_*\), provided that \(\{R_\lambda ,\,\lambda >0\}\) is amenable and that \(\sigma ^2_* \ge 1/4\), by solving the equations

$$\begin{aligned} \mu _* = \mu (\lambda ^{\prime }) + m + p; \quad \sigma ^2_* \ =\ \sigma ^2(\lambda ^{\prime }) + p(1-p). \end{aligned}$$

To do so, let \(\lambda _p\) solve \(\sigma ^2(\lambda _p) = \sigma ^2_* - p(1-p)\), possible for \(0\le p \le 1\), since \(\sigma ^2_* \ge 1/4\) and the function \(\sigma ^2\) has an inverse; note also that \(\lambda _0 \!=\! \lambda _1\). Define \(m_p := \mu _* \!-\! \mu (\lambda _p) - p\), continuous under the assumptions on \(\sigma ^2\), and observe that \(m_0 = m_1 + 1\). Hence the value \(m = \lfloor m_0 \rfloor \) is realized in the form \(m_p\) for some \(0 \le p < 1\), and then \(\lambda _p, \,m_p\) and \(p\) satisfy (4.8). In the Poisson case, for instance, this gives

$$\begin{aligned} m&= \lfloor \tilde{a}_1^2 - 2\tilde{a}_2 \rfloor ; \quad p^2 \ =\ \langle \tilde{a}_1^2 - 2\tilde{a}_2 \rangle ;\nonumber \\ \lambda ^{\prime }&= \lambda + 2\tilde{a}_2 + \tilde{a}_1(1-\tilde{a}_1) - p(1-p), \end{aligned}$$

where \(\langle x\rangle \) denotes the fractional part of \(x\).
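The Poisson-case formulas for \(m\), \(p\) and \(\lambda ^{\prime }\) can be checked directly against the two moment equations. The following is a minimal sketch under the assumption that \(\lambda ^{\prime } > 0\) for the chosen inputs; the function names are hypothetical and not taken from the paper.

```python
import math

def match_poisson(lam, a1, a2):
    """Return (m, p, lam_prime) from the Poisson-case formulas in the text."""
    x = a1 * a1 - 2.0 * a2
    m = math.floor(x)
    p = math.sqrt(x - m)  # p^2 is the fractional part of x, so 0 <= p < 1
    lam_prime = lam + 2.0 * a2 + a1 * (1.0 - a1) - p * (1.0 - p)
    return m, p, lam_prime

def q_moments(m, p, lam_prime):
    """Mean and variance of Z + m + I, with Z ~ Po(lam_prime), I ~ Be(p) independent."""
    return lam_prime + m + p, lam_prime + p * (1.0 - p)

def target_moments(lam, a1, a2):
    """Mean mu_* and variance sigma_*^2 of nu_r in the Poisson case."""
    return lam + a1, lam + 2.0 * a2 + a1 * (1.0 - a1)

if __name__ == "__main__":
    lam, a1, a2 = 50.0, 0.7, -0.3
    m, p, lam_prime = match_poisson(lam, a1, a2)
    print(m, p, lam_prime)
    print(q_moments(m, p, lam_prime), target_moments(lam, a1, a2))
```

With these inputs the two printed moment pairs should agree up to rounding, reflecting the identity \(m + p^2 = \tilde{a}_1^2 - 2\tilde{a}_2\) that underlies the formulas.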

Suppose now that we have an approximation of a distribution \(P_X\) by some measure \(\nu _r(R_\lambda ;\tilde{a}_1,\ldots ,\tilde{a}_r)\), for \(r\ge 2\), and with \(\rho (\lambda ) \ge 1\). We wish to show that \(Q_{mp}(R_{\lambda ^{\prime }})\) and \(\nu _r = \nu _r(R_\lambda ;\tilde{a}_1,\ldots ,\tilde{a}_r)\) are close to order \(O\{\rho (\lambda )^{-3/2}\}\), if \(\lambda ^{\prime },m\) and \(p\) are suitably chosen. Matching the first two moments, the choices of \(\lambda ^{\prime },m\) and \(p\) in (4.8) when \(\mu _*\) and \(\sigma ^2_*\) are given by (4.5) are such as to give

$$\begin{aligned} \mu (\lambda ^{\prime }) + m + p \ =\ \mu (\lambda ) + \tilde{a}_1; \quad \sigma ^2(\lambda ^{\prime }) + p(1-p) \ =\ \sigma ^2(\lambda ) + 2\tilde{a}_2 + \tilde{a}_1(1-\tilde{a}_1), \end{aligned}$$

implying that, for \(b_1 := b_1(\lambda ,\lambda ^{\prime })\) and \(b_2 := b_2(\lambda ,\lambda ^{\prime })\) given in (4.3),

$$\begin{aligned} b_1 + m + p \ =\ \tilde{a}_1 ;\qquad b_2 + (m+p)b_1 + mp + \left( {\begin{array}{c}m\\ 2\end{array}}\right) \ =\ \tilde{a}_2; \end{aligned}$$

note also that \(m, \,|\lambda ^{\prime }-\lambda |, \,|\mu (\lambda ^{\prime }) - \mu (\lambda )|\) and \(|\sigma ^2(\lambda ^{\prime }) - \sigma ^2(\lambda )|\) are uniformly bounded for \((\tilde{a}_1,\tilde{a}_2)\) in any compact set. Now, from the definition of \(\delta _{\lambda ,\lambda ^{\prime }}\) and from (4.7), \(q_{mp}(R_{\lambda ^{\prime }})(\theta )\) can be written as \(r_\lambda (\theta )\psi _{\lambda ,\lambda ^{\prime }}(\theta )\), with

$$\begin{aligned} \psi _{\lambda ,\lambda ^{\prime }}(\theta ) \ =\ \{\delta _{\lambda ,\lambda ^{\prime }}(\theta ) + [1 + b_1(e^{i\theta }-1) + b_2(e^{i\theta }-1)^2 ] \} e^{i\theta m}(1 + p(e^{i\theta }-1)).\nonumber \\ \end{aligned}$$

However, in view of (4.10),

$$\begin{aligned} (1 + b_1w + b_2w^2)(1+w)^m(1+pw) - (1+\tilde{a}_1w + \tilde{a}_2w^2) \end{aligned}$$

is a polynomial in \(w\) that begins with the \(w^3\)-term, so that

$$\begin{aligned} {\hat{d}}(\theta ) \ :=\ [1 + b_1(e^{i\theta }-1) + b_2(e^{i\theta }-1)^2 ] e^{i\theta m}(1 + p(e^{i\theta }-1)) - {\tilde{\psi }}_r(\theta )\qquad \quad \end{aligned}$$


satisfies

$$\begin{aligned} {\hat{d}}(0) \ =\ {\hat{d}}^{\prime }(0) \ =\ 0; \quad |{\hat{d}}^{\prime \prime }(\theta )| \le {\hat{\gamma }}|\theta |, \text{ in } |\theta | \le \pi , \end{aligned}$$

where \({\hat{\gamma }}= {\hat{\gamma }}(\tilde{a}_1,\ldots ,\tilde{a}_r)\) remains bounded if \(\tilde{a}_1,\ldots ,\tilde{a}_r\) do. In view of (4.4) and (4.11)–(4.13), \(q_{mp}\) and \({\tilde{\psi }}_r r_\lambda \) are \((0,0,\pi )\)-smoothly mod \(r_\lambda \) polynomially close, with \(M=1\) and \(t_1=3\), for a constant \(\gamma _1 = \gamma _1(\tilde{a}_1,\ldots ,\tilde{a}_r)\), whose definition depends on the family \(R_\lambda \). In view of Corollary 4.1(5), this proves the following result.

Proposition 4.2

Suppose that the family \(\{R_\lambda ,\,\lambda >0\}\) is amenable, and that \(\lambda ^{\prime }, \,m\) and \(p\) are chosen to satisfy (4.8) for \(\mu _*\) and \(\sigma ^2_*\) given by (4.5). Then

$$\begin{aligned} \Vert Q_{mp}(R_{\lambda ^{\prime }}) - \nu _r(R_\lambda ;\tilde{a}_1,\ldots ,\tilde{a}_r)\Vert \ \le \ \alpha ^*(3,\gamma ^{\prime })\gamma _1 \rho (\lambda )^{-3/2}. \end{aligned}$$

Thus the signed measure \(\nu _r(R_\lambda ;\tilde{a}_1,\ldots ,\tilde{a}_r)\) can be replaced as approximation by the probability distribution \(Q_{mp}(R_{\lambda ^{\prime }})\) with an additional error in total variation of order at most \(O(\rho (\lambda )^{-3/2})\).

Suppose that, instead of having a bound on \(d_{\mu \nu }:= \psi - {\tilde{\psi }}_r\), we are given an approximation to \(\psi \) by a Taylor expansion \(\psi _r(\theta ) := \sum _{l=0}^ra_l (i\theta )^l\) around \(\theta =0\), for real coefficients \(a_l\) (and with \(a_0=1\)) and some \(r\in \mathbb{N }_0\). Then, equating coefficients of powers of \(i\theta \), it follows that

$$\begin{aligned} |\psi _r(\theta ) - {\tilde{\psi }}_r(\theta )| \ \le \ U_{r} |\theta |^{r+1},\quad |\theta | \le \pi , \end{aligned}$$

for \(U_r := U_r(a_1,\ldots ,a_r)\), if \(\tilde{a}_1,\ldots ,\tilde{a}_r\) are defined implicitly by

$$\begin{aligned} a_j \ :=\ \sum _{l=1}^j \tilde{a}_l\sum _{(s_1,\ldots ,s_l)\in S_{j-l}} \prod _{t=1}^l \frac{1}{(s_t+1)!}, \end{aligned}$$

where \(S_m := \{(s_1,\ldots ,s_l):\,\sum _{t=1}^l s_t = m\}\). Hence we can replace any bound on the difference \(\psi - \psi _r\) by a corresponding bound on \(d_{\mu \nu }\) in the assumptions of the theorems, in which the original bound is increased by \(U_{r} |\theta |^{r+1}\). This will typically not change the order of the approximation obtained.
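The implicit relation between the Taylor coefficients \(a_j\) and the coefficients \(\tilde{a}_l\) amounts to expanding each \((e^{i\theta }-1)^l\) in powers of \(i\theta \). A minimal sketch in exact rational arithmetic follows; the helper names are hypothetical, and the inner sum over \(S_{j-l}\) is evaluated by repeated truncated series multiplication rather than by enumerating the tuples.

```python
from fractions import Fraction
from math import factorial

def power_coeffs(l, r):
    """Coefficients c[j], j = 0..r, of x^j in (e^x - 1)^l, as exact fractions."""
    # Truncated series of e^x - 1: x/1! + x^2/2! + ... + x^r/r!
    base = [Fraction(0)] + [Fraction(1, factorial(k)) for k in range(1, r + 1)]
    out = [Fraction(1)] + [Fraction(0)] * r
    for _ in range(l):
        new = [Fraction(0)] * (r + 1)
        for i, u in enumerate(out):
            if u == 0:
                continue
            for k in range(1, r + 1 - i):
                new[i + k] += u * base[k]
        out = new
    return out

def taylor_from_tilde(a_tilde):
    """Return [a_0, ..., a_r] with a_0 = 1 and a_j = sum_l a~_l [x^j](e^x - 1)^l."""
    r = len(a_tilde)
    a = [Fraction(1)] + [Fraction(0)] * r
    for l, at in enumerate(a_tilde, start=1):
        c = power_coeffs(l, r)
        for j in range(l, r + 1):
            a[j] += Fraction(at) * c[j]
    return a

if __name__ == "__main__":
    print(taylor_from_tilde([1, 1, 1]))
```

Since the system is triangular with unit diagonal (\(a_j = \tilde{a}_j + \) terms in \(\tilde{a}_1,\ldots ,\tilde{a}_{j-1}\)), it can be inverted recursively to obtain the \(\tilde{a}_l\) from given \(a_j\).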

Sometimes it is convenient, for simplicity, to use parameters in the expansions that are not those emerging naturally from the proofs. Under the conditions on the family \(\{R_\lambda ,\,\lambda >0\}\) imposed in this section, this is easy to accommodate. For instance, suppose that, for \(|\theta | \le \pi \),

$$\begin{aligned} \phi _\mu \ :=\ r_\lambda A;\quad \phi _{\nu ^{(1)}} \ :=\ r_\lambda A^{\prime };\quad \phi _{\nu ^{(2)}} \ :=\ r_{\lambda ^{\prime }}A, \end{aligned}$$

with \(A(\theta ) := 1 + \sum _{l=1}^ra_l (e^{i\theta }-1)^l, \,A^{\prime }(\theta ) := 1 + \sum _{l=1}^ra^{\prime }_l (e^{i\theta }-1)^l\) and with \(\lambda > \lambda ^{\prime }\). Then \(d_{\mu \nu }^{(1)}:= A - A^{\prime }\) satisfies

$$\begin{aligned} |d_{\mu \nu }^{(1)}(\theta )| \ \le \ \sum _{l=1}^r|a_l - a^{\prime }_l| \,|\theta |^l, \quad 0 < |\theta | \le \pi , \end{aligned}$$

enabling \(\phi _\mu \) to be replaced by \(\phi _{\nu ^{(1)}}\) in exchange for an error that can be bounded using Corollary 4.1. Similarly, setting \(d_{\mu \nu }^{(2)}:= A(1 - r_{\lambda ^{\prime }}/r_\lambda )\), we have

$$\begin{aligned} |d_{\mu \nu }^{(2)}(\theta )| \ \le \ {\tilde{f}}(|\lambda -\lambda ^{\prime }|)|\theta |\left\{ 1 + \sum _{l=1}^r|a_l| \,|\theta |^l \right\} , \quad 0 < |\theta | \le \pi , \end{aligned}$$

in view of (4.2), where \({\tilde{f}}:\,\mathbb{R }_+\rightarrow \mathbb{R }_+\) is continuous.

Poisson–Charlier expansions

As observed above, the Poisson family satisfies all the requirements placed on the family \(\{R_\lambda ,\,\lambda >0\}\) in the previous section, so all the results of that section can be carried across. In this case, the signed measures \(\nu _r\) on \(\mathbb{N }_0\) have the explicit representation

$$\begin{aligned} \nu _r\{j\} \ :=\ \nu _r(\mathrm{Po\,}(\lambda );\tilde{a}_1,\ldots ,\tilde{a}_r)\{j\} \ :=\ \mathrm{Po\,}(\lambda )\{j\}\left\{ 1 + \sum _{l=1}^r(-1)^l\tilde{a}_lC_l(j;\lambda ) \right\} ,\nonumber \\ \end{aligned}$$


where

$$\begin{aligned} C_l(j;\lambda ) \ :=\ \sum _{k=0}^l(-1)^k{l\atopwithdelims ()k}{j\atopwithdelims ()k}k!\,\lambda ^{-k} \end{aligned}$$

denotes the \(l\)-th Charlier polynomial [2, (1.9), p. 171].
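The explicit representation of \(\nu _r\) is straightforward to evaluate. The sketch below computes \(C_l(j;\lambda )\) and \(\nu _r\{j\}\) from the two displays above; the function names are illustrative assumptions, and the sums over \(j\) in the usage example are truncated at a point where the Poisson weights are negligible.

```python
import math

def poisson_pmf(j, lam):
    """Po(lam){j}, computed on the log scale for stability."""
    return math.exp(-lam + j * math.log(lam) - math.lgamma(j + 1))

def charlier(l, j, lam):
    """C_l(j; lam) = sum_k (-1)^k C(l,k) C(j,k) k! lam^(-k)."""
    # math.perm(j, k) equals C(j,k) * k! and is zero when k > j.
    return sum((-1) ** k * math.comb(l, k) * math.perm(j, k) / lam ** k
               for k in range(l + 1))

def nu_r(j, lam, a_tilde):
    """nu_r{j} for a_tilde = [a~_1, ..., a~_r]."""
    corr = 1.0 + sum((-1) ** l * a * charlier(l, j, lam)
                     for l, a in enumerate(a_tilde, start=1))
    return poisson_pmf(j, lam) * corr

if __name__ == "__main__":
    lam, a_tilde = 20.0, [0.5, -0.2]
    masses = [nu_r(j, lam, a_tilde) for j in range(200)]
    print(sum(masses))                               # total mass
    print(sum(j * w for j, w in enumerate(masses)))  # mean
```

For these inputs the total mass should be close to one and the mean close to \(\lambda + \tilde{a}_1\), in line with the properties of \(\nu _r\) used elsewhere in the text.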

Note that, if \({j\atopwithdelims ()k}\) is replaced by \(j^k/k!\) in (5.2), one obtains the binomial expansion of \((1-j/\lambda )^l\). As this suggests, the values of \(C_l(j;\lambda )\) are in fact small for \(j\) near \(\lambda \) if \(\lambda \) is large:

$$\begin{aligned} |C_l(j;\lambda )| \ \le \ 2^{l-1}\{|1-j/\lambda |^l + (l/\sqrt{\lambda })^l\} \end{aligned}$$

[1, Lemma 6.1]. The equation (5.3) thus implies that, in any interval of the form \(|j-\lambda | \le c\sqrt{\lambda }\), which is where the probability mass of \(\mathrm{Po\,}(\lambda )\) is mostly to be found, the correction to the Poisson measure \(\mathrm{Po\,}(\lambda )\) is of uniform relative order \(O(\lambda ^{-l/2})\). Indeed, the Chernoff inequalities for \(Z \sim \mathrm{Po\,}(\lambda )\) can be expressed in the form

$$\begin{aligned}&\max \{\mathbb{P }[Z > \lambda (1+\delta )], \mathbb{P }[Z < \lambda (1-\delta )]\} \nonumber \\&\quad \ \le \ \exp \{-\lambda \delta ^2/2(1+\delta /3)\} \ \le \ \exp \{-\lambda \delta ^2/3(\delta \vee 1)\}, \end{aligned}$$

[3, Theorem 3.2]. Since also, from (5.2),

$$\begin{aligned} |C_l(j;\lambda )| \ \le \ (1+j/\lambda )^l \ \le \ 2^l \quad \text{ if }\,\, 0\le j\le \lambda , \end{aligned}$$

and since

$$\begin{aligned} {j\atopwithdelims ()k}k!\,\lambda ^{-k}\,\frac{e^{-\lambda }\lambda ^j}{j!} \ =\ \frac{e^{-\lambda }\lambda ^{j-k}}{(j-k)!} \ \le \ \frac{e^{-\lambda }\lambda ^{j-l}}{(j-l)!} \end{aligned}$$

if \(0\le k\le l\) and \(j \ge l + \lambda \), it follows that, for any \(l\ge 0\), we have

$$\begin{aligned} \sum _{j=0}^m |C_l(j;\lambda )|\mathrm{Po\,}(\lambda )\{j\} \ \le \ 2^l \mathbb{P }[Z \le m] \ \le \ 2^l \exp \{-(\lambda -m)^2/3\lambda \} \end{aligned}$$

for \(m\le \lambda ,\) and, for \(l\le r\) and \(m\ge \lambda + r\),

$$\begin{aligned}&\sum _{j\ge m} |C_l(j;\lambda )|\mathrm{Po\,}(\lambda )\{j\} \ \le \ 2^l \mathbb{P }[Z \ge m-l] \ \le \ 2^l \mathbb{P }[Z \ge m-r]\\&\quad \ \le \ 2^l \left\{ \exp \{-(m-r-\lambda )^2/3\lambda \} \vee \exp \{-(m-r-\lambda )/3\} \right\} . \end{aligned}$$

It thus follows that

$$\begin{aligned} |\nu _r|\{[0,m]\}&\le \bar{A}_r\, e^{-(\lambda -m)^2/3\lambda },\quad 0\le m\le \lambda ; \nonumber \\ |\nu _r|\{[m,\infty )\}&\le \bar{A}_r\, \left\{ e^{-(m-r-\lambda )^2/3\lambda } \vee e^{-(m-r-\lambda )/3} \right\} ,\quad m \ge \lambda +r, \end{aligned}$$

where \(\bar{A}_r := 1 + \sum _{l=1}^r2^l|\tilde{a}_l|\), demonstrating exponential concentration of measure for \(\nu _r\) on a scale of \(\sqrt{\lambda }\) around \(\lambda \). Moreover, it can be deduced from (5.3) that there exists a positive constant \(d = d(\tilde{a}_1,\ldots ,\tilde{a}_r)\) such that \(\nu _r\{j\} \ge 0\) for \(|j-\lambda | \le d\lambda \), and it follows from (5.5) that \(|\nu _r|\{j:\,|j-\lambda | > d\lambda \} = O(e^{-\alpha \lambda })\) for some \(\alpha > 0\). Since also \(\nu _r\{\mathbb{N }_0\} = 1\), it thus follows that, even if \(\nu _r\) is formally a signed measure, it differs from a probability only on a set of measure exponentially small with \(\lambda \).
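The tail bounds (5.5) can be compared with the actual tail masses of \(|\nu _r|\). The following is a rough numerical sketch, with hypothetical function names and an assumed truncation point `j_max` for the upper tail; it illustrates the bounds for particular parameter values and proves nothing.

```python
import math

def poisson_pmf(j, lam):
    return math.exp(-lam + j * math.log(lam) - math.lgamma(j + 1))

def charlier(l, j, lam):
    return sum((-1) ** k * math.comb(l, k) * math.perm(j, k) / lam ** k
               for k in range(l + 1))

def abs_nu_r(j, lam, a_tilde):
    """|nu_r{j}| for a_tilde = [a~_1, ..., a~_r]."""
    corr = 1.0 + sum((-1) ** l * a * charlier(l, j, lam)
                     for l, a in enumerate(a_tilde, start=1))
    return poisson_pmf(j, lam) * abs(corr)

def a_bar(a_tilde):
    """A_bar_r = 1 + sum_l 2^l |a~_l|."""
    return 1.0 + sum(2 ** l * abs(a) for l, a in enumerate(a_tilde, start=1))

def lower_tail_ok(m, lam, a_tilde):
    """Check |nu_r|{[0,m]} <= A_bar * exp(-(lam-m)^2 / (3 lam)), for 0 <= m <= lam."""
    mass = sum(abs_nu_r(j, lam, a_tilde) for j in range(m + 1))
    return mass <= a_bar(a_tilde) * math.exp(-(lam - m) ** 2 / (3 * lam))

def upper_tail_ok(m, lam, a_tilde, j_max=400):
    """Check the second bound in (5.5), for m >= lam + r (sum truncated at j_max)."""
    x = m - len(a_tilde) - lam
    mass = sum(abs_nu_r(j, lam, a_tilde) for j in range(m, j_max))
    bound = a_bar(a_tilde) * max(math.exp(-x * x / (3 * lam)), math.exp(-x / 3))
    return mass <= bound

if __name__ == "__main__":
    lam, a_tilde = 20.0, [0.5, -0.2]
    print(all(lower_tail_ok(m, lam, a_tilde) for m in range(0, 21)))
    print(all(upper_tail_ok(m, lam, a_tilde) for m in range(22, 80)))
```

In experiments of this kind the bounds are typically far from tight, which is to be expected, since they combine a crude bound on \(|C_l(j;\lambda )|\) with the Chernoff inequalities.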

Since the measures \(\nu _r\) are so well concentrated, the bounds in Corollary 4.1(2–4) can be made more specific. We give as example a theorem deriving from Part 3, under the simplest conditions.

Theorem 5.1

Suppose that \(X\) is as above, having characteristic function \(\phi _X := \psi p_{\lambda }\), and that (2.5) holds; write \(t = r + \delta \).

If \(\lambda \ge 1\), there is a constant \(\alpha _{4t} = \alpha _{4t}(\tilde{a}_0,\ldots ,\tilde{a}_r)\) such that

$$\begin{aligned} \Vert P_X-\nu _r\Vert \ \le \ \alpha _{4t} K_{r\delta }\lambda ^{-t/2} \max \left\{ 1,\sqrt{|\log K_{r\delta }|},\sqrt{\log (\lambda +1)}\right\} ; \end{aligned}$$

if \(\lambda < 1\), then there is a constant \(\alpha _{5t} = \alpha _{5t}(\tilde{a}_0,\ldots ,\tilde{a}_r)\) such that

$$\begin{aligned} \Vert P_X-\nu _r\Vert \ \le \ \alpha _{5t} K_{r\delta } \lambda ^{-t/2}\max \left\{ 1,|\log K_{r\delta }|\right\} \!. \end{aligned}$$


Of course, for the bound in (5.7) to be of use, \(K_{r\delta }\) should be small.


Proof

For \(\lambda \ge 1\), we use both parts of (5.5), taking

$$\begin{aligned} a \ :=\ \lfloor \lambda - c_{r\lambda }\sqrt{\lambda \log (\lambda +1)} \rfloor \quad \text{ and }\quad b \ :=\ \lceil \lambda + r + c_{r\lambda }\sqrt{\lambda \log (\lambda +1)} \rceil , \end{aligned}$$

where \(\lfloor x \rfloor \le x \le \lceil x \rceil \) denote the integers closest to \(x\), and with

$$\begin{aligned} c_{r\lambda } \ :=\ 3\{(r+1)/2 + |\log K_{r\delta }|/\log (\lambda +1)\}. \end{aligned}$$

If \(r + c_{r\lambda }\sqrt{\lambda \log (\lambda +1)} \le \lambda \), we obtain

$$\begin{aligned} |\nu _r|\{[a,b]^c\} \ \le \ 2\bar{A}_r(\lambda +1)^{-c_{r\lambda }^2/3} \ \le \ 2\bar{A}_r(\lambda +1)^{-c_{r\lambda }/3}, \end{aligned}$$

since \(c_{r\lambda } \ge 1\), and, if \(r + c_{r\lambda }\sqrt{\lambda \log (\lambda +1)} > \lambda \), we get

$$\begin{aligned} |\nu _r|\{[a,b]^c\} \ \le \ 2\bar{A}_r \exp \{- c_{r\lambda }\sqrt{\lambda \log (\lambda +1)}/3\} \ \le \ 2\bar{A}_r(\lambda +1)^{-c_{r\lambda }/3}, \end{aligned}$$

since \(\lambda \ge \log (\lambda + 1)\) in \(\lambda \ge 0\). Hence, in either case, from the definition of \(c_{r\lambda }\), we have

$$\begin{aligned} |\nu _r|\{[a,b]^c\} \ \le \ 2\bar{A}_r K_{r\delta } (\lambda +1)^{-(r+1)/2}. \end{aligned}$$

Hence, from Corollary 4.1(3), with \(\varepsilon =\eta =0, \,M=1, \,\gamma _1 = K_{r\delta }\) and \(t_1=t\), it follows that

$$\begin{aligned}&\Vert P_X-\nu _r\Vert \le \left\{ 2c_{r\lambda }\sqrt{\lambda \log {(\lambda +1)}} +r+ 2 \right\} \alpha ^{\prime }_{1t} K_{r\delta } \lambda ^{-(t+1)/2}\\&\quad +\,\, \alpha ^{\prime }_{2t} K_{r\delta } \lambda ^{-t/2} + 4\bar{A}_r K_{r\delta } \lambda ^{-(r+1)/2}, \end{aligned}$$


where

$$\begin{aligned} \alpha ^{\prime }_{1t} \ :=\ \alpha _{1t}(\pi ^2/2)^{(t+1)/2};\quad \alpha ^{\prime }_{2t} \ :=\ \alpha _{2t}(\pi ^2/2)^{t/2}, \end{aligned}$$

so that

$$\begin{aligned} \Vert P_X-\nu _r\Vert \ \le \ \beta _{3t} K_{r\delta }\lambda ^{-t/2}\sqrt{\log (\lambda +1)} \max \left\{ 1,\frac{|\log K_{r\delta }|}{\log (\lambda +1)} \right\} , \end{aligned}$$

with \(\beta _{3t} := \alpha ^{\prime }_{1t}\{4r+11\} + \alpha ^{\prime }_{2t} + 4\bar{A}_r\).

For \(\lambda <1\), we take \(b := \lceil 2 + r + {3|\log K_{r\delta }|}\rceil \) in (5.5), giving

$$\begin{aligned} |\nu _r|\{[b,\infty )\} \ \le \ \bar{A}_r K_{r\delta }, \end{aligned}$$

and then, from Corollary 4.1(3) as above, it follows that

$$\begin{aligned} \Vert P_X-\nu _r\Vert \ \le \ (r+3+ 3|\log K_{r\delta }|)\alpha ^{\prime }_{1t} K_{r\delta } + \alpha ^{\prime }_{2t} K_{r\delta } + 2\bar{A}_r K_{r\delta }, \end{aligned}$$

so that

$$\begin{aligned} \Vert P_X-\nu _r\Vert \ \le \ \beta ^{\prime }_{3t} K_{r\delta } \max \left\{ 1,|\log K_{r\delta }|\right\} \!, \end{aligned}$$

with \(\beta ^{\prime }_{3t} := \alpha ^{\prime }_{1t}\{r+6\} + \alpha ^{\prime }_{2t} + 2\bar{A}_r\). \(\square \)

Compound Poisson approximation

The theory of Sect. 3 can also be applied when the distributions \(R_\lambda \) come from a compound Poisson family. For \(\lambda > 0\) and for \(\mu \) a probability distribution on \(\mathbb{Z }\), let \(\mathrm{CP\,}(\lambda ,\mu )\) denote the distribution of the sum \(Y := \sum _{j\in \mathbb{Z }\setminus \{0\}} jZ_j\), where \(Z_j, \,j\ne 0\), are independent, and \(Z_j \sim \mathrm{Po\,}(\lambda \mu _j)\). Then, if \(\mu _1>0\), the characteristic function of \(Y\) is of the form \(R_\lambda := \zeta _{\lambda }p_{\lambda _1}\), where \(\zeta _{\lambda }\) is the characteristic function of \(\sum _{j\in \mathbb{Z }\setminus \{0,1\}} jZ_j\) and \(\lambda _1 = \lambda \mu _1\). Thus, for the purposes of applying Corollary 4.1, \(\rho \) can be taken to be \(2\pi ^{-2}\mu _1\lambda \).

These considerations apply as long as \(\mu _1 > 0\), and could also be invoked if \(\mu _{-1} > 0\). If \(\mu _1=\mu _{-1}=0\), there is then no factor of the form \(p_{\lambda }\) to guarantee that, for some \(\rho >0\), the characteristic function \(\phi _Y\) of \(Y\) is \((\rho ,\pi )\)-locally normal. Indeed, if \(Y = 2Z\) where \(Z\sim \mathrm{Po\,}(\lambda )\), and if \(W\sim \mathrm{Be\,}(1/2)\) is independent of \(Y\), it is not true that the distribution of \(Y + W\) is close to that of \(Y\) in total variation, even though \(|\phi _{Y+W}(\theta ) - \phi _Y(\theta )| \le K_0|\theta |\,|\phi _Y(\theta )|\); this is to be compared to the example in Sect. 2.
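The counterexample above can be checked numerically. The following Python sketch (the function names and the truncation level `kmax` are illustrative choices, not taken from the paper) computes the total variation norm \(\sum _j |\mathbb{P }[Y+W=j] - \mathbb{P }[Y=j]|\) for \(Y=2Z\), and the ratio \(|\phi _{Y+W}(\theta ) - \phi _Y(\theta )|/(|\theta |\,|\phi _Y(\theta )|)\). Since \(Y\) is supported on the even integers and \(Y+W\) puts mass \(1/2\) on the odd integers, the norm equals \(1\) for every \(\lambda \), while the ratio equals \(|e^{i\theta }-1|/(2|\theta |) \le 1/2\).

```python
import cmath
import math


def poisson_pmf(lam, kmax):
    """Return [Po(lam){k} for k = 0..kmax], computed in log space."""
    return [math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
            for k in range(kmax + 1)]


def tv_norm_counterexample(lam, kmax=400):
    """Sum_j |P(Y+W=j) - P(Y=j)| for Y = 2Z, Z ~ Po(lam), W ~ Be(1/2).

    kmax is a truncation level (an assumption of this sketch); it should be
    well above lam so that the neglected Poisson tail is negligible.
    """
    pz = poisson_pmf(lam, kmax)
    p_y = [0.0] * (2 * kmax + 2)
    p_yw = [0.0] * (2 * kmax + 2)
    for k, pk in enumerate(pz):
        p_y[2 * k] += pk             # Y = 2k
        p_yw[2 * k] += 0.5 * pk      # W = 0
        p_yw[2 * k + 1] += 0.5 * pk  # W = 1
    return sum(abs(a - b) for a, b in zip(p_y, p_yw))


def cf_ratio(lam, theta):
    """|phi_{Y+W} - phi_Y| / (|theta| |phi_Y|) for theta != 0.

    For very large lam, |phi_Y| can underflow to zero; moderate lam is assumed.
    """
    phi_y = cmath.exp(lam * (cmath.exp(2j * theta) - 1))
    phi_yw = phi_y * (1 + cmath.exp(1j * theta)) / 2
    return abs(phi_yw - phi_y) / (abs(theta) * abs(phi_y))
```

Calling `tv_norm_counterexample(lam)` for increasing `lam` shows no decay, in contrast to the behaviour of the characteristic function ratio.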


Sums of independent random variables

Let \(X_1,\ldots ,X_n\) be independent integer valued random variables, and let \(S_n\) denote their sum. In contexts in which a central limit approximation to the distribution of \(S_n\) would be appropriate, the classical Edgeworth expansion (see, e.g., [7, Chapter 5]) is unwieldy, because \(S_n\) is confined to the integers. As an alternative, Barbour and Čekanavičius [1, Theorem 5.1] give a Poisson–Charlier expansion, for \(S_n\) ‘centred’ so that its mean and variance are almost equal. If the \(X_i\) have variances that are uniformly bounded below and have bounded \((r+1+\delta )\)-th moments, and if the distribution of each \(X_i\) has non-trivial overlap with that of \(X_i+1\), their error bound with respect to the total variation norm is of order \(O(n^{-(r-1+\delta )/2})\). Here, under similar conditions, we use Corollary 4.1 to prove an error bound for their expansion which is of the same order, but is established only with respect to the less stringent Kolmogorov distance. A total variation bound for the error, of the slightly larger order \(O(n^{-(r-1+\delta )/2}\sqrt{\log n})\), could be deduced from Corollary 4.1(3), by taking \(a = \lfloor \lambda - k\sqrt{\lambda \log \lambda } \rfloor \) and \(b = \lceil \lambda + k\sqrt{\lambda \log \lambda } \rceil \), for suitable choice of \(k= k_r\), where \(\lambda = \mathbb{E }S_n\) (and \(\mathbb{E }S_n \approx \mathrm{Var\,}S_n\), because of centring).

Assume that each of the \(X_j\) has finite \((r+1+\delta )\)-th moment, with \(r\ge 1\), and define

$$\begin{aligned} A^{(r)}(w) \ :=\ 1 + \sum _{l\ge 2}\tilde{a}_l^{(r)}w^l \ =\ \exp \left\{ \sum _{l=2}^{r+1} \frac{\kappa _l w^l}{l!}\right\} , \end{aligned}$$

where \(\kappa _l := \kappa _l(S_n)\) and \(\kappa _l(X)\) denotes the \(l\)-th factorial cumulant of the random variable \(X\). Then the approximation that we establish is to the Poisson–Charlier signed measure \(\nu _r\) with

$$\begin{aligned} \nu _r\{j\} \ :=\ \mathrm{Po\,}(\lambda )\{j\}\left\{ 1 + \sum _{l=2}^{L_r} (-1)^l \tilde{a}_l^{(r)}C_l(j;\lambda )\right\} , \end{aligned}$$

where \(L_r := \max \{1,3(r-1)\}\) and \(\lambda := \mathbb{E }S_n\); \(\nu _r\) has characteristic function

$$\begin{aligned} \phi _{\nu _r} \ :=\ p_\lambda (\theta )\,{\widetilde{A}}^{(r)}(\theta ), \end{aligned}$$


where

$$\begin{aligned} {\widetilde{A}}^{(r)}(\theta ) \ :=\ 1 + \sum _{l=2}^{L_r} \tilde{a}_l^{(r)}(e^{i\theta }-1)^l. \end{aligned}$$

We need two further quantities involving the \(X_j\):

$$\begin{aligned} K^{(n)}\ :=\ \left| \sum _{j=1}^n \kappa _2(X_j) \right| \ =\ |\mathrm{Var\,}S_n - \mathbb{E }S_n|, \end{aligned}$$

kept small by judicious centring, and

$$\begin{aligned} p_j \ :=\ 1 - {\textstyle \frac{1}{2}}\Vert \mathcal{L }(X_j) - \mathcal{L }(X_j+1)\Vert . \end{aligned}$$
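Both quantities are easy to evaluate for a given integer valued distribution. The following Python sketch assumes that the law of \(X_j\) is supplied as a dictionary mapping integer values to probabilities (this representation and the function names are our illustrative choices). It uses the identity \(p_j = \sum _k \min \{\mathbb{P }[X_j=k],\mathbb{P }[X_j=k-1]\}\) only implicitly, through the definition above, and the relation \(\kappa _2(X) = \mathrm{Var\,}X - \mathbb{E }X\).

```python
def overlap(pmf):
    """p = 1 - (1/2) * sum_k |P(X = k) - P(X + 1 = k)| for pmf = {value: prob}."""
    support = set(pmf) | {k + 1 for k in pmf}
    l1 = sum(abs(pmf.get(k, 0.0) - pmf.get(k - 1, 0.0)) for k in support)
    return 1.0 - 0.5 * l1


def second_factorial_cumulant(pmf):
    """kappa_2(X) = Var X - E X."""
    mean = sum(k * p for k, p in pmf.items())
    var = sum((k - mean) ** 2 * p for k, p in pmf.items())
    return var - mean


def k_n(pmfs):
    """K^(n) = |sum_j kappa_2(X_j)| for a list of pmfs of independent summands."""
    return abs(sum(second_factorial_cumulant(pmf) for pmf in pmfs))
```

For a Bernoulli variable with success probability \(q\), these functions return \(\min \{q,1-q\}\) and \(-q^2\) respectively; a distribution concentrated on the even integers has overlap \(0\).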

Theorem 7.1

Suppose that there are constants \(K_{l}, \,1\le l\le r+1\), such that, for each \(j\),

$$\begin{aligned} |\kappa _l(X_j)| \ \le \ K_{l},\quad 2\le l\le r+1;\qquad \mathbb{E }|X_j|^{r+1+\delta } \ \le \ K_{1}^{r+1+\delta }. \end{aligned}$$

Suppose also that \(p_j \ge p_0 > 0\) for all \(j\), and that \(\lambda \ge n\lambda _0\). Then

$$\begin{aligned} d_{\mathrm{K}}(\mathcal{L }(S_n),\nu _r) \ \le \ G(K_{1},\ldots ,K_{r+1},K^{(n)},p_0^{-1},\lambda _0^{-1}) n^{-(r-1+\delta )/2}, \end{aligned}$$

for a function \(G\) that is bounded on compact sets.


For asymptotics in \(n\), with triangular arrays of variables, the error is of order \(O(n^{-(r-1+\delta )/2})\) when \(\lambda _0\) and \(p_0\) are bounded away from zero, and \(K_{1},\ldots ,K_{r+1}\) and \(K^{(n)}\) remain bounded. The requirements on \(\lambda _0\) and \(p_0\) can often be achieved by grouping the random variables appropriately, though attention then has to be paid to the consequent changes in the \(K_{l}\). The condition (7.5) can always be satisfied with \(K^{(n)}\le 1\), by replacing the \(X_j\) by translates, where necessary. For more discussion, we refer to [1]. The above conditions are designed to cover sums of independent random variables, each of which has non-trivial variance, has uniformly bounded \((r+1+\delta )\)-th moment, and whose distribution overlaps with its unit translate.


Proof

We check the conditions of Corollary 4.1(2). First, in view of (7.6), we can write

$$\begin{aligned} \mathbb{E }(e^{i\theta X_j}) \ =\ {\textstyle \frac{1}{2}}p_j (e^{i\theta }+1)\phi _{1j}(\theta ) + (1-p_j)\phi _{2j}(\theta ), \end{aligned}$$

where both \(\phi _{1j}\) and \(\phi _{2j}\) are characteristic functions. Hence we have

$$\begin{aligned} \left| \mathbb{E }(e^{i\theta X_j})\right| \ \le \ 1-p_j + p_j\cos (\theta /2) \ \le \ 1 - p_j \theta ^2/4\pi , \quad 0 \le |\theta | \le \pi . \end{aligned}$$

Hence \(\phi _\mu (\theta ) := \mathbb{E }(e^{i\theta S_n})\) satisfies

$$\begin{aligned} |\phi _\mu (\theta )| \ \le \ \exp \{-np_0\theta ^2/4\pi \}, \quad 0 \le |\theta | \le \pi . \end{aligned}$$
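The single-summand bound \(|\mathbb{E }(e^{i\theta X_j})| \le 1 - p_j\theta ^2/4\pi \) can be tested on a grid for any concrete distribution. The sketch below is illustrative only: the dictionary representation of the law of \(X_j\), the grid size and the function names are our assumptions. It returns the largest value of the left-hand side minus the right-hand side over \(|\theta |\le \pi \), which should be non-positive up to rounding.

```python
import cmath
import math


def overlap(pmf):
    """p = 1 - (1/2) * sum_k |P(X = k) - P(X = k - 1)| for pmf = {value: prob}."""
    support = set(pmf) | {k + 1 for k in pmf}
    l1 = sum(abs(pmf.get(k, 0.0) - pmf.get(k - 1, 0.0)) for k in support)
    return 1.0 - 0.5 * l1


def cf_abs(pmf, theta):
    """Modulus of the characteristic function E exp(i * theta * X)."""
    return abs(sum(p * cmath.exp(1j * theta * k) for k, p in pmf.items()))


def max_violation(pmf, grid=2001):
    """max over a theta-grid in [-pi, pi] of |phi(theta)| - (1 - p theta^2 / (4 pi))."""
    p = overlap(pmf)
    worst = -math.inf
    for i in range(grid):
        theta = -math.pi + 2.0 * math.pi * i / (grid - 1)
        bound = 1.0 - p * theta ** 2 / (4.0 * math.pi)
        worst = max(worst, cf_abs(pmf, theta) - bound)
    return worst
```

The maximum is attained at \(\theta =0\), where both sides equal \(1\).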

On the other hand, from the additivity of the factorial cumulants, we have

$$\begin{aligned} |\kappa _l(S_n)| \ \le \ nK_{l}, \quad 3\le l\le r+1, \end{aligned}$$

with \(|\kappa _2(S_n)| \le K^{(n)}\) from (7.5). From (7.1), we thus deduce the bound \(|\tilde{a}_l^{(r)}| \le c_l n^{\lfloor l/3\rfloor }\), for \(c_l = c_l(K^{(n)},K_{3},\ldots ,K_{r+1}), \,l\ge 1\). Hence

$$\begin{aligned} |\phi _{\nu _r}(\theta )| \ \le \ \exp \{-2n\lambda _0\theta ^2/\pi ^2\}c^{\prime }n^{\lfloor L_r/3 \rfloor } \ \le \ \exp \{-n\lambda _0\theta ^2/\pi ^2\}c^{\prime \prime }, \end{aligned}$$

for \(c^{\prime \prime }=c^{\prime \prime }(K^{(n)},K_{3},\ldots ,K_{r+1})\). Combining (7.7) and (7.8), we can thus take \(\eta := C e^{-n\rho ^{\prime }\theta _0^2}\) in (3.2), for

$$\begin{aligned} \rho ^{\prime } = \min \{\lambda _0/\pi ^2,p_0/4\pi \} \end{aligned}$$

and a suitable \(C = C(K^{(n)},K_{3},\ldots ,K_{r+1})\). The choice of \(\theta _0\) we postpone for now.

For \(|\theta | \le \theta _0\), we take \(\chi (\theta ) := p_\lambda (\theta )\), and check the approximation of

$$\begin{aligned} \psi _\mu (\theta ) \ :=\ \phi _\mu (\theta )\exp \{-\lambda (e^{i\theta }-1)\} \ =\ \mathbb{E }\left\{ (1 + w)^{S_n} \right\} e^{-w\mathbb{E }S_n} \end{aligned}$$

by \({\widetilde{A}}^{(r)}(\theta )\) as a polynomial in \(w := e^{i\theta }-1\). We begin with the inequality

$$\begin{aligned} \left| (1 + w)^{s} - \sum _{l=0}^{r+1} \frac{w^l}{l!}\, s_{(l)}\right|&\le \frac{|s_{(r+2)}|}{(r+2)!}\,|w|^{r+2} \wedge 2\frac{|s_{(r+1)}|}{(r+1)!}\,|w|^{r+1}\\&\le \frac{|s_{(r+1)}|}{(r+2)!}\,|w|^{r+1+\delta }\{|s|+r+1\}^\delta \{2(r+2)\}^{1-\delta }, \end{aligned}$$

derived using Taylor’s expansion, true for any \(s\in \mathbb{Z }\) and \(0 < \delta \le 1\), where \(s_{(l)} := s(s-1)\ldots (s-l+1)\). Hence, for each \(j\), we have

$$\begin{aligned} \left| \mathbb{E }\left\{ (1 + w)^{X_j} \right\} - \sum _{l=0}^{r+1} \frac{\mathbb{E }\{(X_j)_{(l)}\}}{l!}\, w^l \right| \ \le \ c_{r,\delta } |\theta |^{r+1+\delta }(K_{1} + K_{1}^{r+1+\delta }), \end{aligned}$$

for a universal constant \(c_{r,\delta }\). Then, writing

$$\begin{aligned} Q^{(s)}_{r+1}(w;X) \ :=\ \exp \left\{ \sum _{l=s}^{r+1}\kappa _l(X)w^l/l! \right\} , \end{aligned}$$

and using the differentiation formula in [7, p. 170], we have

$$\begin{aligned}&\left| Q^{(1)}_{r+1}(w;X_j) - \sum _{l=0}^{r+1} \frac{\mathbb{E }\{(X_j)_{(l)}\}}{l!}\, w^l\right| \nonumber \\&\quad \le \frac{|\theta |^{r+2}}{(r+2)!}\,\sup _{|\theta ^{\prime }| \le \theta _0} \left| \frac{d^{r+2}}{dz^{r+2}} Q^{(1)}_{r+1}(z;X_j) \right| _{z=e^{i\theta ^{\prime }}-1}\nonumber \\&\quad \le |\theta |^{r+2} c(K_{1},\ldots ,K_{r+1}), \end{aligned}$$

for a suitable function \(c\) and for all \(|\theta | \le \pi \). Combining these estimates, we deduce that, for \(w = e^{i\theta }-1\) and for all \(|\theta | \le \pi \),

$$\begin{aligned} \left| \mathbb{E }\left\{ (1 + w)^{X_j} \right\} e^{-\mathbb{E }X_j w} - Q^{(2)}_{r+1}(w;X_j)\right| \ \le \ k_1|\theta |^{r+1+\delta }, \end{aligned}$$

where \(k_1 = k_1(K_{1},\ldots ,K_{r+1})\).

Now a standard inequality shows that, if \(u_j := \prod _{l=1}^j x_l \prod _{l=j+1}^n y_l\) for complex \(x_l,y_l\) with \(y_l \ne 0\) and \(|x_l/y_l - 1| \le \varepsilon _l\), then

$$\begin{aligned} |u_n-u_0| \ \le \ |u_0| \left\{ \prod _{s=1}^{n-1}(1+\varepsilon _s) \right\} \sum _{l=1}^n \varepsilon _l. \end{aligned}$$
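The product inequality just stated can be illustrated on random complex data. In the following Python sketch, the sampling scheme (moduli of the \(y_l\) in \([0.5,1.5]\), relative perturbations of modulus at most `scale`) and all names are arbitrary choices made for illustration; \(\varepsilon _l\) is taken to be exactly \(|x_l/y_l-1|\).

```python
import cmath
import random


def product(values):
    """Product of an iterable of (possibly complex) numbers."""
    out = 1 + 0j
    for v in values:
        out *= v
    return out


def check_product_inequality(n, scale, seed=0):
    """Return (|u_n - u_0|, |u_0| * prod_{s<n}(1 + eps_s) * sum_l eps_l)."""
    rng = random.Random(seed)
    ys = [cmath.rect(rng.uniform(0.5, 1.5), rng.uniform(-cmath.pi, cmath.pi))
          for _ in range(n)]
    # x_l = y_l * (1 + delta_l) with |delta_l| <= scale
    xs = [y * (1 + cmath.rect(rng.uniform(0.0, scale),
                              rng.uniform(-cmath.pi, cmath.pi)))
          for y in ys]
    eps = [abs(x / y - 1) for x, y in zip(xs, ys)]
    u_n = product(xs)  # all factors x_l
    u_0 = product(ys)  # all factors y_l
    lhs = abs(u_n - u_0)
    rhs = abs(u_0) * product(1 + e for e in eps[:-1]).real * sum(eps)
    return lhs, rhs
```

In every run the first returned value should not exceed the second, up to floating point rounding.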

Taking \(x_j := \mathbb{E }\{ (1 + w)^{X_j} \} e^{-\mathbb{E }X_j w}\) and \(y_j := Q^{(2)}_{r+1}(w;X_j)\), (7.11) shows that we can take \(\varepsilon _l := \varepsilon := k_1|\theta |^{r+1+\delta }e^E\) for each \(l\), with

$$\begin{aligned} E := \exp \left\{ \sum _{l=2}^{r+1}K_{l}/l!\right\} , \end{aligned}$$

provided that \(|\theta | \le \theta _0 \le 1\). For \(r\ge 2\), choosing \(\theta _0 := n^{-1/3}\) then ensures that \((1+\varepsilon )^n\) is suitably bounded, and (7.12) yields

$$\begin{aligned} \left| \mathbb{E }\left\{ (1 + w)^{S_n} \right\} e^{-w\mathbb{E }S_n} - Q^{(2)}_{r+1}(w;S_n)\right| \ \le \ k_2 n|\theta |^{r+1+\delta }, \end{aligned}$$

for \(k_2 = k_2(K^{(n)},K_{1},\ldots ,K_{r+1})\), since

$$\begin{aligned} |u_0| \ :=\ |Q^{(2)}_{r+1}(w;S_n)| \ \le \ \exp \{|\kappa _2(S_n)|\theta _0^2/2\} \exp \left\{ \sum _{l=3}^{r+1}nK_{l} \theta _0^l/l! \right\} \end{aligned}$$

is bounded for \(\theta _0 = n^{-1/3}\), in view of (7.5). For \(r=1\), \(|u_0|\) is uniformly bounded if \(\theta _0 \le 1\), and the choice \(\theta _0 = n^{-1/(2+\delta )}\) ensures that \((1+\varepsilon )^n\) remains bounded.

The remaining step is to note that, for \(w:=e^{i\theta }-1\), \({\widetilde{A}}^{(r)}(\theta )\) contains all terms up to the power \(w^{L_r}\) in the power series expansion of \(Q^{(2)}_{r+1}(w;S_n)\), giving

$$\begin{aligned} \left| Q^{(2)}_{r+1}(w;S_n) - {\widetilde{A}}^{(r)}(\theta )\right| \ \le \ \frac{|\theta |^{L_r+1}}{(L_r+1)!}\,\sup _{|\theta ^{\prime }| \le |\theta |} \left| \frac{d^{L_r+1}}{dz^{L_r+1}} Q^{(2)}_{r+1}(z;S_n) \right| _{z=e^{i\theta ^{\prime }}-1}.\nonumber \\ \end{aligned}$$

Now \(|\kappa _2(S_n)|\) is bounded by \(K^{(n)}\), and, for \(l\ge 3\), each \(\kappa _l(S_n)\), for which we have only the weak bound \(nK_{l}\), occurs associated with the power \(w^l\) in the exponent of \(Q^{(2)}_{r+1}(w;S_n)\). Writing

$$\begin{aligned} \frac{d^s}{dz^s} Q^{(2)}_{r+1}(z;S_n) \ =\ P_s(n,z) Q^{(2)}_{r+1}(z;S_n), \end{aligned}$$

the monomials that make up \(P_s(n,z)\) thus have coefficients of magnitude \(n^l\) associated with powers \(z^m\) with \(m \ge (2l - (s-l))_+ = (3l-s)_+\), so that they are themselves of magnitude at most \(O(n^{l - (3l-s)_+/3}) = O(n^{s/3})\) in \(|\theta ^{\prime }| \le n^{-1/3}\). Taking \(s=L_r+1\) and \(r\ge 2\), the case \(m=0\) requires that \(l \le r-1\), and \(l\ge r\) entails \(m\ge 2\), so that, for \(r\ge 2\) and \(|\theta |\le \theta _0\),

$$\begin{aligned} \sup _{|\theta ^{\prime }| \le |\theta |} \left| \frac{d^{L_r+1}}{dz^{L_r+1}} Q^{(2)}_{r+1}(z;S_n) \right| _{z=e^{i\theta ^{\prime }}-1} \ \le \ k_3 n^{r-1}(1 + n|\theta |^2), \end{aligned}$$

with \(k_3 = k_3(K^{(n)},K_{1},\ldots ,K_{r+1})\). If \(|\theta | \ge n^{-1/2}\), it follows that the bound in (7.14) is at most \(2k_3\{(L_r+1)!\}^{-1} n^r |\theta |^{3r}\); if \(|\theta | \le n^{-1/2}\), the bound is at most \(2k_3\{(L_r+1)!\}^{-1} n |\theta |^{r+2}\). Combining this with (7.13), we have established that for \(|\theta |\le n^{-1/3}\) and \(r\ge 2\), we have

$$\begin{aligned} |\phi _\mu (\theta )\exp \{-\lambda (e^{i\theta }-1)\} - {\widetilde{A}}^{(r)}(\theta )| \ \le \ k_4 n|\theta |^{r+1+\delta }(1 + (n|\theta |^2)^{r-1}),\qquad \quad \end{aligned}$$

where \(k_4 = k_4(K^{(n)},K_{1},\ldots ,K_{r+1})\). This shows that \(\phi _\mu \) and \(\phi _{\nu _r}\) are \((0,\eta ,\theta _0)\)-mod \(p_\lambda \) polynomially close, with

$$\begin{aligned} M = 2,\quad \gamma _1 = nk_4,\quad t_1=r+1+\delta ,\quad \gamma _2 = n^{r}k_4,\quad t_2 = 3r-1 + \delta , \end{aligned}$$

and with \(\theta _0 = n^{-1/3}\) and \(\eta = Ce^{-n^{1/3}\rho ^{\prime }}\), this last from the bounds (7.7) and (7.8). Applying Corollary 4.1(2), taking \(a = 0\) and \(b=2\lambda \), and using the tail properties of the Poisson–Charlier measures (5.5), the theorem follows for \(r\ge 2\).

For \(r=1\), the bound in (7.14) is easily of order \(|\theta |^2\), giving a bound in (7.15) of \(k^{\prime }_4(n|\theta |^{2+\delta } + |\theta |^2)\). This leads to the choices

$$\begin{aligned} M = 2,\quad \gamma _1 = nk^{\prime }_4,\quad t_1=2+\delta ,\quad \gamma _2 = k^{\prime }_4,\quad t_2 = 2, \end{aligned}$$

together with \(\theta _0 = n^{-1/(2+\delta )}\) and \(\eta = Ce^{-n^{\delta /(2+\delta )}\rho ^{\prime }}\), and the remainder of the proof is as before. \(\square \)

Analytic combinatorial schemes

An extremely interesting range of applications is to be found in the paper of Hwang [5]. His conditions are motivated by examples from combinatorics, in which generating functions are natural tools. He works in an asymptotic setting, assuming that \(X_n\) is a random variable whose probability generating function \(G_n\) is of the form

$$\begin{aligned} G_n(z) \ =\ z^h(g(z) + \varepsilon _n(z))e^{\lambda (z-1)}, \end{aligned}$$

where \(h\) is a non-negative integer, and both \(g\) and \(\varepsilon _n\) are analytic in a closed disc of radius \(\eta > 1\). As \(n\rightarrow \infty \), he assumes that \(\lambda \rightarrow \infty \) and that \(\sup _{z:|z|\le \eta }|\varepsilon _n(z)| \le K\lambda ^{-1}\), uniformly in \(n\). He then proves a number of results describing the accuracy of the approximation of \(P_{X_n-h}\) by \(\mathrm{Po\,}(\lambda + g^{\prime }(1))\).

Under his conditions, it is immediate that we can write

$$\begin{aligned} g(z) \ =\ \sum _{j\ge 0}g_j(z-1)^j {\quad \text{ and }\quad }\varepsilon _n(z) \ =\ \sum _{j\ge 0}\varepsilon _{nj}(z-1)^j \end{aligned}$$

for \(|z-1| < \eta -1\), with

$$\begin{aligned} |g_j| \ \le \ k_g (\eta -1)^{-j} {\quad \text{ and }\quad }|\varepsilon _{nj}| \ \le \ \lambda ^{-1}k_\varepsilon (\eta -1)^{-j} \end{aligned}$$

for all \(j\ge 0\). Hence \(X := X_n-h\) has characteristic function of the form \(\psi ^{(n)}p_{\lambda }\), where

$$\begin{aligned} \psi ^{(n)}(\theta ) \ =\ g(e^{i\theta }) + \varepsilon _n(e^{i\theta }), \end{aligned}$$

and thus, for any \(r\in \mathbb{N }_0\),

$$\begin{aligned} |\psi ^{(n)}(\theta ) - {\tilde{\psi }}^{(n)}_r(\theta )| \ \le \ K_{r1} |\theta |^{r+1},\quad |\theta | \le (\eta -1)/2, \end{aligned}$$

with \({\tilde{\psi }}^{(n)}_r\) defined as in (2.4), taking \(\tilde{a}^{(n)}_j = g_j+\varepsilon _{nj}\); note that the constant \(K_{r1}\) can indeed be taken to be uniform for all \(n\). Since also \(g\) and \(\varepsilon _n\) are both uniformly bounded on the unit circle, and since \({\tilde{\psi }}^{(n)}_r\) is bounded (uniformly in \(n\)) for \(|\theta | \le \pi \), it is clear that (7.18) can be extended to all \(|\theta | \le \pi \), albeit with a different uniform constant \(K^{\prime }_{r1}\), so that (2.5) holds with \(\delta =1\) for any \(r\in \mathbb{N }_0\). Thus Parts 1–3 of Corollary 4.1 (with \(R_\lambda =\mathrm{Po\,}(\lambda )\) and \(\rho (\lambda )=2\lambda /\pi ^2\)) can be applied with any choice of \(r\), giving progressively more accurate approximations to \(P_{X_n-h}\), as far as the \(\lambda \)-order is concerned, in terms of progressively more complicated perturbations of the Poisson distribution. These theorems are thus applicable to all the examples that Hwang considers, including the numbers of components (counted in various ways) in a wide class of logarithmic assemblies, multisets and selections.

For instance, using translated Poisson approximation as in Sect. 4.2 by way of Proposition 4.2 gives an approximation to \(P_{X_n-h}\) by the mixture \(Q_{mp}(\mathrm{Po\,}(\lambda ^{\prime }))\), where, from (4.9),

$$\begin{aligned} m \ :=\ \lfloor m_n - v_n \rfloor ;\quad p^2 \ :=\ \langle m_n-v_n \rangle ;\quad \lambda ^{\prime } \ :=\ \lambda + v_n - p(1-p), \end{aligned}$$

where \(m_n := g_n^{\prime }(1), \,v_n := g_n^{\prime \prime }(1) + g_n^{\prime }(1) - \{g_n^{\prime }(1)\}^2\) and \(g_n := g + \varepsilon _n\). Hwang’s approximation by \(\mathrm{Po\,}(\lambda + g^{\prime }(1))\) has asymptotically the same mean as ours (and as that of \(X_n-h\)), but a variance asymptotically differing by \(\kappa := g^{\prime \prime }(1) - \{g^{\prime }(1)\}^2\). As a consequence, Hwang’s approximation has an error of larger asymptotic order, in which the quantity \(\kappa \) appears; for instance, for Kolmogorov distance, his Theorem 1 gives an error of order \(O(\lambda ^{-1})\), whereas that obtained using Corollary 4.1(2) together with Proposition 4.2 is of order \(O(\lambda ^{-3/2})\).
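The parameter choice above matches the first two moments exactly, which can be seen from a short computation. The Python sketch below makes two assumptions that are not spelled out in this section: that \(\langle x \rangle \) denotes the fractional part \(x - \lfloor x \rfloor \), and that \(Q_{mp}(\mathrm{Po\,}(\lambda ^{\prime }))\) is the law of \(m + B + Z\) with \(B\sim \mathrm{Be\,}(p)\) independent of \(Z \sim \mathrm{Po\,}(\lambda ^{\prime })\). The function names are ours.

```python
import math


def translated_poisson_params(lam, m_n, v_n):
    """Return (m, p, lam_prime) from the display above.

    Assumes <x> is the fractional part of x = m_n - v_n, so that p^2 = x - floor(x).
    """
    x = m_n - v_n
    m = math.floor(x)
    p = math.sqrt(x - m)
    lam_prime = lam + v_n - p * (1.0 - p)
    return m, p, lam_prime


def mixture_mean_var(m, p, lam_prime):
    """Mean and variance of m + B + Z, with B ~ Be(p) independent of Z ~ Po(lam_prime)."""
    mean = m + p + lam_prime
    var = lam_prime + p * (1.0 - p)
    return mean, var
```

Under these assumptions, `mixture_mean_var(*translated_poisson_params(lam, m_n, v_n))` returns \((\lambda + m_n, \lambda + v_n)\), since \(m + p^2 = m_n - v_n\).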

Although our Poisson expansion theorems are automatically applicable under Hwang’s conditions, they also apply to examples that do not satisfy his conditions: the simple example at the end of Sect. 2 is one such. Conversely, Hwang’s Theorem 2, which establishes Poisson approximation in the lower tail with good relative accuracy, cannot be proved using only our conditions; the conclusion would not be true, for instance, in the example just mentioned.

Note also that Hwang examines problems from combinatorial settings in which approximation is not by Poisson distributions: he has examples concerning the (amenable) Bessel family of distributions,

$$\begin{aligned} B(\lambda )\{j\} \ :=\ L(\lambda )^{-1}\frac{\lambda ^j}{j!(j-1)!},\quad j\in \mathbb{N }, \end{aligned}$$

for the appropriate choice of normalizing constant \(L(\lambda )\). Thus we could apply Corollary 4.1 to obtain asymptotically more accurate expansions, and, in conjunction with Proposition 4.2, obtain slightly sharper approximations than his within the translated Bessel family.

Prime divisors

The numbers of prime divisors of a positive integer \(n\), counted either with (\(\Omega (n)\)) or without (\(\omega (n)\)) multiplicity, can also be treated by these methods, since excellent information is available about their generating functions. For our purposes, we use only the shortest expansion, taken from [11, Theorems II.6.1 and 6.2]. One finds that, for \(N_n\) uniformly distributed on \(\{1,2,\ldots ,n\}\), the characteristic functions of \(\omega (N_n)\) and \(\Omega (N_n)\) are given by

$$\begin{aligned} \mathbb{E }\{e^{i\theta \omega (N_n)}\}&= p_{\log \log n}(\theta )\left\{ \Phi _1(e^{i\theta }-1)+ \varepsilon _1(\theta )\right\} ;\\ \mathbb{E }\{e^{i\theta \Omega (N_n)}\}&= p_{\log \log n}(\theta )\left\{ \Phi _2(e^{i\theta }-1) + \varepsilon _2(\theta )\right\} , \end{aligned}$$

where \(|\varepsilon _s(\theta )| \le C_s/\log n, \,s=1,2\), for some constants \(C_1\) and \(C_2\), and

$$\begin{aligned} \Phi _1(w)&:= \frac{1}{\Gamma (1+w)} \prod _q \Bigl (1 + \frac{w}{q}\Bigr )\, \Bigl (1 - \frac{1}{q} \Bigr )^w;\\ \Phi _2(w)&:= \frac{1}{\Gamma (1+w)} \prod _q \Bigl (1 - \frac{w}{q-1} \Bigr )^{-1} \, \Bigl (1 - \frac{1}{q} \Bigr )^w, \end{aligned}$$

\(q\) running here over prime numbers. These expansions were established and used by Rényi and Turán [9] in their proof of the Erdős–Kac Theorem, but they are also sketched by Selberg [10].

Kowalski and Nikeghbali [6] have emphasized the structural interpretation of these functions, which we now recall. Write

$$\begin{aligned} \Phi _{1,1}(\theta )=\frac{1}{\Gamma (e^{i\theta })},\quad \Phi _{1,2}(\theta )=\prod _{q}{\left( 1+\frac{e^{i\theta }-1}{q}\right) \left( 1-\frac{1}{q}\right) ^{e^{i\theta }-1}}, \end{aligned}$$

so that \(\Phi _1(e^{i\theta }-1)=\Phi _{1,1}(\theta )\Phi _{1,2}(\theta )\).

Let \(X_n\) be the random variable giving the number of disjoint cycles appearing in the decomposition of a random uniformly distributed permutation of size \(n\). In addition, let \(Y_n\) be a random variable of the form

$$\begin{aligned} Y_n=\sum _{q\le n}{B_q} \end{aligned}$$

where the \(B_q\) are independent Bernoulli random variables indexed by primes, with \(\mathbb{P }[B_q=1]=1/q\); \(Y_n\) represents a naive model of the number of prime divisors \(\le n\) of a large integer.

Then we have

$$\begin{aligned} \mathbb{E }\{e^{i\theta X_n}\}\sim p_{\log n}(\theta ) \Phi _{1,1}(\theta ), \end{aligned}$$


and

$$\begin{aligned} \mathbb{E }\{e^{i\theta Y_n}\}\sim p_{\log \log n}(\theta ) \Phi _{1,2}(\theta ). \end{aligned}$$

This suggests an interpretation of the Rényi–Turán formula as a probabilistic decomposition of \(\omega (N_n)\) in terms of random permutations of size \(\log n\) and the naive divisibility model for integers, with an intricate dependency structure. We note that in the setting of polynomials over finite fields, this interpretation was shown by Kowalski and Nikeghbali [6] to have a precise meaning and to be very useful.
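The asymptotic relation for the number of cycles \(X_n\) can be examined numerically in its real-argument form. The sketch below relies on the classical identity \(\mathbb{E }\{z^{X_n}\} = \prod _{k=1}^n (k-1+z)/k\) for a uniform random permutation, and compares it with \(n^{z-1}/\Gamma (z)\), the analogue of \(p_{\log n}(\theta )\Phi _{1,1}(\theta )\) when \(e^{i\theta }\) is replaced by a real \(z>0\). Restricting to real \(z\) is a simplification made here so that only the standard library is needed; the function names are ours.

```python
import math


def cycle_pgf(n, z):
    """E[z^{X_n}] = prod_{k=1}^{n} (k - 1 + z) / k, for real z > 0."""
    log_val = sum(math.log((k - 1 + z) / k) for k in range(1, n + 1))
    return math.exp(log_val)


def mod_poisson_prediction(n, z):
    """exp((z - 1) log n) / Gamma(z), for real z > 0."""
    return n ** (z - 1.0) / math.gamma(z)


def ratio(n, z):
    """Exact generating function divided by the mod-Poisson prediction."""
    return cycle_pgf(n, z) / mod_poisson_prediction(n, z)
```

For \(z=2\) the ratio is exactly \((n+1)/n\), and for other fixed \(z>0\) it likewise tends to \(1\) as \(n\rightarrow \infty \).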

We come back to the application of our results to \(\omega (N_n)\) and \(\Omega (N_n)\). Let \(\tilde{a}_{ls}, \,s=1,2\), denote the Taylor coefficients of the functions \(\Phi _s(w)\) as power series in \(w\) (around \(w=0\), which corresponds to \(\theta =0\)). By analyticity near \(0\), it follows that, for any \(r\), we have

$$\begin{aligned} \left| \Phi _s(w) \!-\! 1 \!-\! \sum _{l=1}^r\tilde{a}_{ls} w^l \right| \ \le \ C_{rs}|w|^{r\!+\!1}; \left| \Phi ^{\prime \prime }_s(w) \!-\! \sum _{l=2}^r\tilde{a}_{ls} l(l-1) w^{l-2} \right| \!\ \le \ \! C^{\prime }_{rs}|w|^{r-1}, \end{aligned}$$

for suitable constants \(C_{rs}, C^{\prime }_{rs}\) and for \(|w|\le 2\). In order to approximate the distributions \(P_{\omega (N_n)}\) and \(P_{\Omega (N_n)}\), we define the measures \(\nu _r^{(s)}\) by

$$\begin{aligned} \nu _r^{(s)}\{j\} \ :=\ \mathrm{Po\,}(\log \log n)\{j\}\left( 1 + \sum _{l=1}^r(-1)^l \tilde{a}_{ls} C_l(j;\log \log n) \right) , \end{aligned}$$

and invoke Corollary 4.1 with \(M=1, \,\theta _0=\pi \) and \(\varepsilon = C_s/\log n\), together with (3.30); this leads to the following conclusion, which refines the Erdős–Kac theorem.

Theorem 7.2

For the measures \(\nu _r^{(s)}\) defined above, we have

$$\begin{aligned} d_{\mathrm{loc}}(P_{\omega (N_n)},\nu _r^{(1)})&\le \alpha ^{\prime }_{1,r+1} C_{r1}(\log \log n)^{-1-r/2} + \tilde{\alpha }_1 C_1/\log n;\\ \Vert P_{\omega (N_n)} - \nu _r^{(1)}\Vert&\le 2\alpha ^{\prime }(r+1,\pi ^2/2) C^{\prime }_{r1}\Bigl (1+\frac{2}{r} \Bigr )(\log \log n)^{-(r+1)/2}\\&+ {\widetilde{C}}_1\log \log n/\log n;\\ d_{\mathrm{loc}}(P_{\Omega (N_n)},\nu _r^{(2)})&\le \alpha ^{\prime }_{1,r+1} C_{r2}(\log \log n)^{-1-r/2} + \tilde{\alpha }_1 C_2/\log n;\\ \Vert P_{\Omega (N_n)} - \nu _r^{(2)}\Vert&\le 2\alpha ^{\prime }(r+1,\pi ^2/2) C^{\prime }_{r2}\Bigl (1+\frac{2}{r} \Bigr )(\log \log n)^{-(r+1)/2}\\&+ {\widetilde{C}}_2\log \log n/\log n, \end{aligned}$$

for suitable constants \({\widetilde{C}}_1\) and \({\widetilde{C}}_2\), and with \(\alpha ^{\prime }_{1l}\) as defined in (5.9).


As far as we know, total variation approximation was first considered in this context by Harper [4], who proved a bound with error of size \(1/(\log \log n)\) (for a truncated version of \(\omega (n)\), counting only prime divisors of size up to \(n^{1/(3(\log \log n)^2)}\)), and deduced explicit bounds in Kolmogorov distance.

To indicate what this means in concrete terms for number theory readers, consider the case of \(\omega (n)\) for \(r=1\). Taylor expansion gives

$$\begin{aligned} \Phi _1(w) \ =\ 1 + B_1w + O(w^2) \end{aligned}$$

as \(w\rightarrow 0\), where \(B_1\approx 0.26149721\) is the Mertens constant, i.e., the real number such that

$$\begin{aligned} \sum _{\mathop {q\le x}\limits _{q\,\mathrm{prime}}}{\frac{1}{q}}=\log \log x+B_1+o(1), \end{aligned}$$

as \(x\rightarrow +\infty \). An application of Theorem 7.2 with \(r=1\) gives

$$\begin{aligned} \left| \frac{1}{n}|\{k\le n\,\mid \, \omega (k)\in A\}| - \nu _1^{(1)}\{A\}\right|&\ \le {\textstyle \frac{1}{2}}\Vert P_{\omega (N_n)} - \nu _1^{(1)}\Vert \\&\ =O\Bigl (\frac{1}{\log \log n}\Bigr ), \end{aligned}$$

for any set \(A\) of positive integers, where

$$\begin{aligned} \nu _1^{(1)}\{j\} \ =\ \mathrm{Po\,}(\log \log n)\{j\} \Bigl (1 - B_1\left\{ 1 - \frac{j}{\log \log n}\right\} \Bigr ). \end{aligned}$$

Higher expansions could be computed in much the same way.
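For readers who wish to experiment, the following Python sketch computes the exact distribution of \(\omega (k)\) for \(k\le n\) by a sieve, and its total variation norm distance both to \(\mathrm{Po\,}(\log \log n)\) and to \(\nu _1^{(1)}\) as given above. The function names and the truncation level `jmax` are illustrative assumptions, and the value of \(B_1\) is the one quoted in the text. Since \(\log \log n\) grows extremely slowly, no particular size of the distances should be expected for the values of \(n\) that can be reached in this way.

```python
import math

B1 = 0.26149721  # Mertens constant, as quoted in the text


def omega_counts(n):
    """Return counts with counts[j] = #{1 <= k <= n : omega(k) = j}."""
    omega = [0] * (n + 1)
    for q in range(2, n + 1):
        if omega[q] == 0:  # no smaller prime divides q, so q is prime
            for multiple in range(q, n + 1, q):
                omega[multiple] += 1
    counts = [0] * (max(omega) + 1)
    for k in range(1, n + 1):
        counts[omega[k]] += 1
    return counts


def poisson_point(lam, j):
    """Po(lam){j}, computed in log space."""
    return math.exp(-lam + j * math.log(lam) - math.lgamma(j + 1))


def nu1(lam, j):
    """nu_1^{(1)}{j} = Po(lam){j} * (1 - B1 * (1 - j / lam))."""
    return poisson_point(lam, j) * (1.0 - B1 * (1.0 - j / lam))


def tv_norms(n, jmax=60):
    """Return (||P - Po(lam)||, ||P - nu_1||) with lam = log log n, n >= 3."""
    lam = math.log(math.log(n))
    counts = omega_counts(n)
    emp = [c / n for c in counts] + [0.0] * (jmax + 1 - len(counts))
    d_po = sum(abs(emp[j] - poisson_point(lam, j)) for j in range(jmax + 1))
    d_nu = sum(abs(emp[j] - nu1(lam, j)) for j in range(jmax + 1))
    return d_po, d_nu
```

The signed measure returned by `nu1` has total mass \(1\) and mean \(\log \log n + B_1\), which can be confirmed by direct summation.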

Alternatively, a more accurate approximation is available from Theorem 7.2 with \(r=2\), while staying within the realm of (translated) Poisson distributions, by invoking Proposition 4.2. For this, we compute the expansion of \(\Phi _1\) to order \(2\), obtaining (after some calculations) that

$$\begin{aligned} \Phi _1(w) \ =\ 1+\tilde{a}_1w+ \tilde{a}_2w^2 + O(w^3),\quad \text{ as } w\rightarrow 0, \end{aligned}$$


with

$$\begin{aligned} \tilde{a}_1 \ :=\ B_1;\quad \tilde{a}_2 \ :=\ \frac{B_1^2}{2} - \frac{\pi ^2}{12} - \frac{1}{2} \sum _{q \text{ prime }}{\frac{1}{q^2}} \end{aligned}$$

(use \(1/\Gamma (1+w)=1+\gamma w+(\gamma ^2/2-\pi ^2/12)w^2+O(w^3)\), as well as the Mertens identity

$$\begin{aligned} \gamma +\sum _{q \text{ prime }}{\left( \frac{1}{q}+ \log \left( 1-\frac{1}{q}\right) \right) }= B_1, \end{aligned}$$

and expand every term in the Euler product). This corresponds to (2.5), since \(w=e^{i\theta }-1\).

We can then apply Theorem 7.2 and Proposition 4.2 to yield the translated Poisson approximation \(Q_{mp}(\mathrm{Po\,}(\lambda ^{\prime }))\), with \(\lambda ^{\prime }, m\) and \(p\) found from (4.9). With

$$\begin{aligned} x \ :=\ \tilde{a}_1^2 - 2\tilde{a}_2 \ =\ \frac{\pi ^2}{6}+\sum _{q \text{ prime }} {\frac{1}{q^2}}\ \approx \ 2.0971815, \end{aligned}$$

this gives

$$\begin{aligned} p&= \sqrt{\langle x \rangle }\ \approx \ 0.31173945; \quad m \ =\ 2;\\ \lambda ^{\prime }&= \log \log n + B_1 - x - p(1-p)\ \approx \ \log \log n - 2.0502422. \end{aligned}$$

Thus, for any positive integer \(n\) and any set \(A\) of positive integers, we have

$$\begin{aligned}&\left| \frac{1}{n}|\{k\le n\,\mid \, \omega (k)\in A\}| - \{p\mathrm{Po\,}(\lambda ^{\prime })\{A-3\} + (1-p)\mathrm{Po\,}(\lambda ^{\prime })\{A-2\}\}\right| \\&\quad \ =\ O\Bigl (\frac{1}{(\log \log n)^{3/2}}\Bigr ). \end{aligned}$$

Similar results hold for \(\Omega (n)\), where one obtains the following approximate values for the quantities \(p,m,\lambda ^{\prime }\):

$$\begin{aligned} p \ \approx \ 0.5195; \quad m \ =\ 0; \quad \lambda ^{\prime } \ \approx \ \log \log n + 0.5152. \end{aligned}$$
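The numerical constants above can be reproduced, to the accuracy permitted by truncating the sums over primes, with the following Python sketch. The cut-off `limit`, the function names, and the reading of \(\langle x \rangle \) as the fractional part of \(x\) are our assumptions. The formulas used for \(\Omega \), namely \(\tilde{a}_1 = \gamma + \sum _q \{1/(q-1) + \log (1-1/q)\}\) and \(x = \pi ^2/6 - \sum _q (q-1)^{-2}\), are obtained by expanding \(\log \Phi _2\) in the same way as for \(\Phi _1\); they are not stated explicitly in the text.

```python
import math

EULER_GAMMA = 0.5772156649015329


def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i in range(limit + 1) if sieve[i]]


def translated_params(a1, x):
    """Return (m, p, c) with m = floor(x), p^2 = x - m, lambda' = log log n + c."""
    m = math.floor(x)
    p = math.sqrt(x - m)
    return m, p, a1 - x - p * (1.0 - p)


def prime_divisor_constants(limit=10**6):
    """Truncated prime sums; the neglected tails are of order 1/(limit * log(limit))."""
    primes = primes_up_to(limit)
    # omega: a1 = B_1 (Mertens identity), x = pi^2/6 + sum 1/q^2
    a1_w = EULER_GAMMA + sum(1.0 / q + math.log1p(-1.0 / q) for q in primes)
    x_w = math.pi ** 2 / 6 + sum(1.0 / q ** 2 for q in primes)
    # Omega: derived analogously from Phi_2 (an assumption of this sketch)
    a1_W = EULER_GAMMA + sum(1.0 / (q - 1) + math.log1p(-1.0 / q) for q in primes)
    x_W = math.pi ** 2 / 6 - sum(1.0 / (q - 1) ** 2 for q in primes)
    return {"omega": translated_params(a1_w, x_w),
            "Omega": translated_params(a1_W, x_W)}
```

With the default cut-off, the returned triples \((m,p,c)\) agree with the values displayed above to the number of digits shown there, where \(\lambda ^{\prime } = \log \log n + c\).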