Abstract
We build on the formalism developed in Ernst et al. (First order covariance inequalities via Stein’s method, 2019) to propose new representations of solutions to Stein equations. We provide new uniform and nonuniform bounds on these solutions (a.k.a. Stein factors). We use these representations to obtain representations for differences between expectations in terms of solutions to the Stein equations. We apply these to compute abstract Stein-type bounds on Kolmogorov, total variation and Wasserstein distances between arbitrary distributions. We apply our results to several illustrative examples and compare our results with current literature on the same topic, whenever possible. In all occurrences our results are competitive.
References
Barbour, A.D., Holst, L., Janson, S.: Poisson Approximation. Clarendon Press, Oxford (1992)
Barbour, A.D., Xia, A.: On Stein’s factors for Poisson approximation in Wasserstein distance. Bernoulli 12(6), 943–954 (2006)
Baricz, Á.: Mills’ ratio: monotonicity patterns and functional inequalities. J. Math. Anal. Appl. 340(2), 1362–1370 (2008)
Cacoullos, T., Papathanasiou, V., Utev, S.A., et al.: Variational inequalities with examples and an application to the central limit theorem. Ann. Probab. 22(3), 1607–1618 (1994)
Chatterjee, S., Fulman, J., Röllin, A.: Exponential approximation by Stein’s method and spectral graph theory. ALEA 8, 197–223 (2011)
Chatterjee, S., Shao, Q.-M., et al.: Nonnormal approximation by Stein’s method of exchangeable pairs with application to the Curie–Weiss model. Ann. Appl. Probab. 21(2), 464–483 (2011)
Chen, L.H.: Poisson approximation for dependent trials. Ann. Probab. 3, 534–545 (1975)
Chen, L.H., Goldstein, L., Shao, Q.-M.: Normal approximation by Stein’s method. Springer, Berlin (2010)
Döbler, C.: Stein’s method of exchangeable pairs for absolutely continuous, univariate distributions with applications to the Polya urn model. arXiv:1207.0533 (2012)
Döbler, C., et al.: Stein’s method of exchangeable pairs for the beta distribution and generalizations. Electron. J. Probab. 20, 1–34 (2015)
Döbler, C., Peccati, G.: The gamma Stein equation and noncentral de Jong theorems. Bernoulli 24(4B), 3384–3421 (2018)
Duembgen, L., Samworth, R., Wellner, J.: Bounding distributional errors via density ratios. arXiv:1905.03009 (2019)
Ehm, W.: Binomial approximation to the Poisson binomial distribution. Stat. Probab. Lett. 11(1), 7–16 (1991)
Erhardsson, T.: Stein’s method for Poisson and compound Poisson. Introd. Stein’s Method 4, 61 (2005)
Ernst, M., Reinert, G., Swan, Y.: First order covariance inequalities via Stein’s method. Submitted for publication (2019)
Ernst, M., Reinert, G., Swan, Y.: On infinite covariance expansions. arXiv:1906.08376 (2019)
Gibbs, A.L., Su, F.E.: On choosing and bounding probability metrics. Int. Stat. Rev. 70(3), 419–435 (2002)
Goldstein, L., Reinert, G.: Distributional transformations, orthogonal polynomials, and Stein characterizations. J. Theor. Probab. 18(1), 237–260 (2005)
Goldstein, L., Reinert, G.: Stein’s method for the beta distribution and the Polya-Eggenberger urn. J. Appl. Probab. 50(04), 1187–1205 (2013)
Ley, C., Reinert, G., Swan, Y.: Distances between nested densities and a measure of the impact of the prior in Bayesian statistics. Ann. Appl. Probab. 27(1), 216–241 (2017)
Nourdin, I., Peccati, G.: Normal Approximations with Malliavin Calculus: from Stein’s Method to Universality, vol. 192. Cambridge University Press, Cambridge (2012)
Pinelis, I.: Exact bounds on the closeness between the Student and standard normal distributions. ESAIM Probab. Stat. 19, 24–27 (2015)
Acknowledgements
The research of YS was partially supported by the Fonds de la Recherche Scientifique—FNRS under Grants Nos. F.4539.16 and J.0197.20.F. ME acknowledges partial funding via a Welcome Grant of the Université de Liège. YS thanks Céline Esser for many fruitful discussions and Davy Paindaveine for bringing paper [12] to our attention. We also thank the referee whose remarks and suggestions helped us improve the exposition of our results (and also correct several imprecisions). Most of all, Marie and Yvik thank Gesine Reinert whose ideas are at the basis of references [15, 16], without which this paper could not have been written.
Appendices
Appendix A: Some More Proofs
Proof of Lemma 2.23
Introduce \( \Phi ^{\ell }_p(u, x, v) = {\chi ^{\ell }(u, x)\chi ^{-\ell }(x, v)}/{p(x)}\) for all \(x \in {\mathcal {S}}(p)\) and 0 elsewhere, which allows us to perform “probabilistic integration” as follows: if \(f \in \mathrm {dom}(\Delta ^{-\ell })\) is such that \((\Delta ^{-\ell }f)\) is integrable on \([x_1, x_2] \cap {\mathcal {S}}(p)\), then
for all \(x_1 < x_2 \in {\mathcal {S}}(p)\). We can use this function to obtain
(we use the fact that \(\chi ^\ell (x,y)+\chi ^{-\ell }(y,x) =1+{\mathbb {I}}[\ell =0]{\mathbb {I}}[x=y]\)) and it only remains to reorganize the integrand to obtain the claim. To this end, we note how, by definition,
where the first identity is immediate by definition of \(\Phi _p^\ell \) and the last identity follows from the definition of the generalized indicator \(\chi ^{\ell }\). \(\square \)
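As an informal sanity check of the identity \(\chi ^\ell (x,y)+\chi ^{-\ell }(y,x) =1+{\mathbb {I}}[\ell =0]{\mathbb {I}}[x=y]\) used in the proof above, the following Python sketch enumerates it over a small integer grid. The definition of the generalized indicator is not restated in this appendix; the convention \(\chi ^{\ell }(x,y) = {\mathbb {I}}[x \le y - {\mathbb {I}}[\ell =1]]\) used below is an assumption, and the helper names `chi` and `identity_holds` are hypothetical.

```python
from itertools import product


def chi(ell, x, y):
    """Generalized indicator under the ASSUMED convention
    chi^ell(x, y) = I[x <= y - I[ell == 1]]."""
    shift = 1 if ell == 1 else 0
    return 1 if x <= y - shift else 0


def identity_holds(ell, x, y):
    """Check chi^ell(x, y) + chi^{-ell}(y, x) = 1 + I[ell == 0] * I[x == y]."""
    lhs = chi(ell, x, y) + chi(-ell, y, x)
    rhs = 1 + (1 if ell == 0 else 0) * (1 if x == y else 0)
    return lhs == rhs


# Integer grid only: for ell = +1 or -1 the identity is a lattice statement.
grid = range(-5, 6)
all_ok = all(
    identity_holds(ell, x, y)
    for ell, x, y in product((-1, 0, 1), grid, grid)
)
print(all_ok)
```

The enumeration is restricted to integers because, under the assumed convention, the cases \(\ell = \pm 1\) are only meant for lattice-valued arguments.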
Proof of Lemma 2.26
The expressions (2.21) and (2.22) of the solution g follow directly from the definition of \({\mathcal {L}}_p^\ell \) and its representation (2.19). The first expression (2.26) of the derivative follows directly from (2.8). For the second claim, we first prove the following results:
We first prove (A.2). Starting from (2.7) and applying repeatedly (2.19) then (2.20) (once to h and once to \(\eta \)), we obtain
We now prove (A.3). By similar arguments as above, this follows from
To conclude, we decompose the above expectation into four parts with: \(X_i<x+{\mathbb {I}}[\ell = 1]\) and/or \(X_i\ge x+{\mathbb {I}}[\ell = 1]\), for \(i=1,2\) (i.e., using either \(\chi ^{-\ell }(X_i,x)\) or \(\chi ^{\ell }(x,X_i)\)). Therefore, by considering separately \(\ell \in \{0,-1,1\}\), we can easily verify that
and
Basic manipulations then give
which leads to the claim as \(\bar{P}(x+{\mathbb {I}}[\ell =1]) +P(x-{\mathbb {I}}[\ell = -1])=1\) and \(\ell ={\mathbb {I}}[\ell =1] -{\mathbb {I}}[\ell = -1]\). \(\square \)
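The two elementary relations closing the proof above can be checked numerically in the discrete cases \(\ell = \pm 1\). The sketch below is only an illustration: it assumes the conventions \(P(x) = {\mathbb {P}}[X \le x]\) and \(\bar{P}(x) = {\mathbb {P}}[X \ge x]\) (the latter is not restated in this appendix), and it uses a Poisson law with an arbitrary rate as a test case. The names `LAM`, `N`, `P_bar` and `gap` are hypothetical.

```python
import math

LAM = 3.0   # illustrative Poisson rate (an assumption, not from the paper)
N = 100     # truncation point for the infinite tail sum


def pmf(k):
    """Poisson(LAM) probability mass at integer k (0 outside the support)."""
    return math.exp(-LAM) * LAM**k / math.factorial(k) if k >= 0 else 0.0


def P(x):
    """Distribution function P(x) = Pr[X <= x]."""
    return sum(pmf(k) for k in range(0, x + 1))


def P_bar(x):
    """ASSUMED survival convention: P_bar(x) = Pr[X >= x], truncated at N."""
    return sum(pmf(k) for k in range(max(x, 0), N + 1))


def gap(ell, x):
    """|P_bar(x + I[ell = 1]) + P(x - I[ell = -1]) - 1| for ell in {-1, 1}."""
    return abs(P_bar(x + (ell == 1)) + P(x - (ell == -1)) - 1.0)


max_gap = max(gap(ell, x) for ell in (-1, 1) for x in range(0, 20))
# ell = I[ell = 1] - I[ell = -1] for the three admissible values of ell.
sign_ok = all(ell == (ell == 1) - (ell == -1) for ell in (-1, 0, 1))
print(max_gap, sign_ok)
```

For \(\ell = 0\) the first relation reduces to \(P(x) + \bar{P}(x) = 1\), which holds for any continuous distribution, so only the lattice cases are enumerated here.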
Proof of Lemma 2.27
The condition implies that \(g^-\) is nondecreasing and nonnegative over \({\mathcal {S}}(p) \cap (-\infty , \xi ]\) and nondecreasing and nonpositive over \({\mathcal {S}}(p) \cap (\xi , \infty )\). Therefore, the absolute value of the solution of the point mass equation (2.28) reaches its supremum at \(\xi \) or \(\xi +1\), which gives the bound (2.29). Moreover, the supremum of the difference is attained between \(\xi \) and \(\xi +1\). Using the explicit expression (2.17) and the relation \(\tau _p^\ell (x+\ell )p(x+\ell )=\tau _p^{-\ell }(x)p(x)\), we have
Furthermore, as \(x-{\mathbb {E}}[X] = \tau _p^+(x)-\tau _p^-(x)\), we have \(\tau ^-_p(\xi ) \ge \tau ^+_p(\xi )\) if \(\xi \le {\mathbb {E}}[X]\) (resp. \(\tau ^-_p(\xi ) \le \tau ^+_p(\xi )\) if \(\xi \ge {\mathbb {E}}[X]\)). Therefore, the supremum is bounded by \(\frac{P(\xi -1)+1-P(\xi )}{\tau _p^+(\xi )} =\frac{1-p(\xi )}{\tau _p^+(\xi )}\) if \(\xi \le {\mathbb {E}}[X]\) and otherwise by \(\frac{1-p(\xi )}{\tau _p^-(\xi )}\).
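The two Stein kernel relations used in this proof, \(\tau _p^\ell (x+\ell )p(x+\ell )=\tau _p^{-\ell }(x)p(x)\) and \(x-{\mathbb {E}}[X] = \tau _p^+(x)-\tau _p^-(x)\), can be illustrated on a Poisson distribution. The following Python sketch is a numerical check under stated assumptions, not the paper's construction: it takes the forward and backward kernels to be \(\tau _p^+(x) = \sum _{j \ge x}(j-{\mathbb {E}}[X])p(j)/p(x)\) and \(\tau _p^-(x) = \sum _{j > x}(j-{\mathbb {E}}[X])p(j)/p(x)\), which is one convention consistent with both relations. The names `LAM`, `N`, `tail_moment`, `tau_plus` and `tau_minus` are hypothetical.

```python
import math

LAM = 3.0   # illustrative Poisson rate (assumption)
N = 120     # truncation point for the tail sums
MU = LAM    # mean of the Poisson(LAM) distribution


def pmf(k):
    """Poisson(LAM) probability mass at a nonnegative integer k."""
    return math.exp(-LAM) * LAM**k / math.factorial(k)


def tail_moment(start):
    """Truncated sum of (j - MU) * p(j) over j >= start."""
    return sum((j - MU) * pmf(j) for j in range(max(start, 0), N + 1))


def tau_plus(x):
    """ASSUMED forward Stein kernel: sum over j >= x, divided by p(x)."""
    return tail_moment(x) / pmf(x)


def tau_minus(x):
    """ASSUMED backward Stein kernel: sum over j > x, divided by p(x)."""
    return tail_moment(x + 1) / pmf(x)


xs = range(0, 16)
# Relation tau^+(x + 1) p(x + 1) = tau^-(x) p(x)  (the case ell = 1).
err_shift = max(
    abs(tau_plus(x + 1) * pmf(x + 1) - tau_minus(x) * pmf(x)) for x in xs
)
# Relation x - E[X] = tau^+(x) - tau^-(x).
err_mean = max(abs((tau_plus(x) - tau_minus(x)) - (x - MU)) for x in xs)
# For the Poisson law these kernels reduce to tau^+(x) = x and tau^-(x) = LAM.
err_poisson = max(
    max(abs(tau_plus(x) - x), abs(tau_minus(x) - LAM)) for x in xs
)
print(err_shift, err_mean, err_poisson)
```

In this example \(\tau _p^-(\xi ) = \lambda \ge \xi = \tau _p^+(\xi )\) exactly when \(\xi \le {\mathbb {E}}[X]\), in line with the case distinction made above.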
By Remark 2.21, the solution \(g_A^\ell (x)\) is explicit and defined by \(g_\xi \) for \(\xi \in A\). The sign of \(g_\xi \) changes according to the relative position of \(\xi \) and x. Then, combined with the hypotheses, the maximal value of \(|g_A^-(x)|\) is attained either at \(x=\min _{\xi \in A}\{\xi \}=:\xi _{1}\) or at \(x=\max _{\xi \in A}\{\xi \}+1=:\xi _{2}+1\). Then,
Finally, due to the monotonicity of each \(g_\xi (x)\) function, the maximal difference \(|\Delta g_A(x)|\) is bounded by the supremum of \(|\Delta g_\xi (x)|\) for \(\xi \in A\), which is enough to conclude. \(\square \)
Proof of Theorem 3.2
First take \(c_1(x)= c_2(x) = 1 \) in (3.4). Without any further assumptions on h, the solution \(g_h^*\) of (1.5) with \(c(x)=1\) can be represented as
Hence, we obtain (3.5).
Next take \(\eta _1 = \eta _2 = \mathrm {Id}\) in (3.3). Then, \(-{\mathcal {L}}_{\infty }^\ell \eta _1(x) = \tau _\infty ^\ell (x)\) and \(-{\mathcal {L}}_n^\ell \eta _2(x) = \tau _n^\ell (x)\), the Stein kernels of \(p_\infty \) and \(p_n\). Without any further assumptions on h, the solution \(g_h(x)\) of (1.4) with \(\eta =\mathrm {Id}\) can be represented as
Hence we get (3.6).
Appendix B: Some More Inequalities
Corollary B.1
(Identity (3.6), Stein kernels and \(\ell = 0\)) Under the same assumptions and with exactly the same notations as in Corollary 3.4, the following results hold true.
1. The Kolmogorov distance between the random variables \(X_n\) and \(X_\infty \) is
$$\begin{aligned} \mathrm {Kol}(X_n,X_\infty )&= \sup _z \Bigg | {\mathbb {E}}\Bigg [ \frac{\tau _n(X_n) -\tau _\infty (X_n)}{\tau _{\infty }(X_n)}{\mathbb {I}}_{{\mathcal {S}}_{\infty }}(X_n) \\&\qquad \times \Bigg ( {P_{\infty }(z) - {\mathbb {I}}[X_n \le z]} + \frac{X_n -{\mathbb {E}}[X_{\infty }]}{\tau _{\infty }(X_n)} \frac{P_{\infty }(X_n \wedge z) \bar{P}_{\infty }(X_n\vee z)}{p_{\infty }(X_n)} \Bigg ) \Bigg ] + \kappa _{\mathrm {Id}}(z) \Bigg | \end{aligned} \tag{B.1}$$
and satisfies
$$\begin{aligned} \mathrm {Kol}(X_n,X_\infty )&\le {\mathbb {E}}\left[ \left| \frac{\tau _n(X_n)}{\tau _{\infty }(X_n)}-1 \right| \left( 1 + \frac{|X_n -{\mathbb {E}}[X_{\infty }]|}{\tau _{\infty }(X_n)} \frac{P_\infty (X_n)\bar{P}_\infty (X_n)}{p_\infty (X_n)} \right) {\mathbb {I}}_{{\mathcal {S}}_{\infty }}(X_n) \right] + \sup _z |\kappa _{\mathrm {Id}}(z)| \end{aligned} \tag{B.2}$$
where
$$\begin{aligned} \kappa _{\mathrm {Id}}(z)&= (\mu _n - \mu _{\infty }) {\mathbb {E}} \left[ \frac{P_{\infty }(X_n \wedge z) \bar{P}_{\infty }(X_n\vee z)}{\tau _{\infty }(X_n)p_{\infty }(X_n)} \right] \\&\quad + \lim _{x \nearrow b_n \wedge b_{\infty }}\frac{\tau _n(x)}{\tau _\infty (x)} \frac{p_n(x)}{p_\infty (x)}P_\infty (x\wedge z)\bar{P}_\infty (x\vee z)\\&\quad -\lim _{x \searrow a_n \vee a_{\infty }}\frac{\tau _n(x)}{\tau _\infty (x)} \frac{p_n(x)}{p_\infty (x)}P_\infty (x\wedge z)\bar{P}_\infty (x\vee z). \end{aligned}$$
2. The total variation distance between \(X_n\) and \(X_\infty \) is
$$\begin{aligned} \mathrm {TV}(X_n, X_{\infty })&= \kappa _{\mathrm {Id}}({\mathbb {I}}_{A_{n}^{\infty }}) +{\mathbb {E}}\Bigg [ \frac{\tau _n(X_n) - \tau _\infty (X_n)}{\tau _{\infty }(X_n)}{\mathbb {I}}_{{\mathcal {S}}_{\infty }}(X_n) \\&\qquad \times \Bigg ({P}_{\infty }(A_n^{\infty }) - {\mathbb {I}}_{ A_n^{\infty }}(X_n) \\&\qquad +\frac{X_n -{\mathbb {E}}[X_{\infty }]}{\tau _{\infty }(X_n)} \frac{ {P}_\infty (A_{n}^{\infty }\cap (-\infty , X_n])-{P}_\infty (A_{n}^{\infty }) P_{\infty }(X_n)}{p_\infty (X_n)} \Bigg ) \Bigg ] \end{aligned} \tag{B.3}$$
and satisfies
$$\begin{aligned} \mathrm {TV}(X_n, X_{\infty })&\le {\mathbb {E}}\left[ \left| \frac{\tau _n(X_n)}{\tau _{\infty }(X_n)}-1 \right| \left( 1 + \frac{|X_n -{\mathbb {E}}[X_{\infty }]|}{\tau _{\infty }(X_n)} \frac{P_\infty (X_n)\bar{P}_\infty (X_n)}{p_\infty (X_n)} \right) {\mathbb {I}}_{{\mathcal {S}}_{\infty }}(X_n)\right] + \kappa _{\mathrm {Id}}({\mathbb {I}}_{A_{n}^{\infty }}) \end{aligned} \tag{B.4}$$
with
$$\begin{aligned} \kappa _{\mathrm {Id}}({\mathbb {I}}_{A_{n}^{\infty }})&= \lim _{x \nearrow b_n \wedge b_{\infty }}\frac{\tau _n(x)}{\tau _\infty (x)} \frac{p_n(x)}{p_\infty (x)} \left( {P}_\infty (A_{n}^{\infty }\cap (-\infty ,x]) -{P}_\infty (A_{n}^{\infty }) P_{\infty }(x)\right) \\&\quad - \lim _{x \searrow a_n \vee a_{\infty }}\frac{\tau _n(x)}{\tau _\infty (x)} \frac{p_n(x)}{p_\infty (x)} \left( {P}_\infty (A_{n}^{\infty }\cap (-\infty ,x]) -{P}_\infty (A_{n}^{\infty }) P_{\infty }(x)\right) \\&\quad + (\mu _n-\mu _{\infty }) {\mathbb {E}} \left[ \frac{ {P}_\infty (A_{n}^{\infty } \cap (-\infty , X_n])-{P}_\infty (A_{n}^{\infty }) P_{\infty }(X_n)}{{\tau _{\infty }(X_n)} p_\infty (X_n)} \right] . \end{aligned}$$
3. The Wasserstein distance between \(X_n\) and \(X_\infty \) is
$$\begin{aligned} \mathrm {Wass}(X_n, X_{\infty })&= \sup _{h \in \mathrm {Lip}(1)} \Bigg | \kappa _{\mathrm {Id}}(h) + {\mathbb {E}}\left[ \frac{\tau _n(X_n) -\tau _\infty (X_n)}{\tau _{\infty }(X_n)} h'(X_\infty ) \right. \\&\qquad \left. \left( R_{\infty }(X_n, X_{\infty }) +\frac{X_n-{\mathbb {E}}[X_{\infty }]}{\tau _{\infty }(X_n)} \tilde{K}_\infty (X_n,X_\infty )\right) {\mathbb {I}}_{{\mathcal {S}}_{\infty }}(X_n)\right] \Bigg | \end{aligned} \tag{B.5}$$
and satisfies
$$\begin{aligned} \mathrm {Wass}(X_n, X_{\infty })&\le 2 {\mathbb {E}}\left[ \left| \frac{\tau _n(X_n)}{\tau _{\infty }(X_n)}-1 \right| |X_n - {\mathbb {E}}[X_{\infty }]|{\mathbb {I}}_{{\mathcal {S}}_{\infty }}(X_n)\right] + \sup _{h \in \mathrm {Lip}(1)} \kappa _{\mathrm {Id}}(h) \end{aligned} \tag{B.6}$$
where
$$\begin{aligned} \kappa _{\mathrm {Id}}(h)&= \lim _{x \searrow a_n \vee a_{\infty }} \frac{\tau _n(x)}{\tau _\infty (x)}\frac{p_n(x)}{p_{\infty }(x)} \int _{a_{\infty }}^{b_{\infty }} h'(u) {P_{\infty }(x \wedge u) \bar{P}_{\infty }(x \vee u) } \mathrm {d}u\\&\quad - \lim _{x \nearrow b_n \wedge b_{\infty }}\frac{\tau _n(x)}{\tau _\infty (x)} \frac{p_n(x)}{p_{\infty }(x)} \int _{a_{\infty }}^{b_{\infty }}h'(u) {P_{\infty }(x \wedge u) \bar{P}_{\infty }(x \vee u) } \mathrm {d}u\\&\quad + (\mu _n - \mu _{\infty }) {\mathbb {E}}\left[ \frac{h'(X_\infty )}{\tau _{\infty }(X_{n})} \left( R_{\infty }(X_n, X_{\infty }) + \frac{X_n - {\mathbb {E}}[X_{\infty }]}{\tau _{\infty }(X_n)} \tilde{K}_\infty (X_n, X_\infty )\right) \right] . \end{aligned}$$
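When assessing bounds of the kind stated in Corollary B.1, it can help to have the three distances computed directly from their definitions as a baseline. The Python sketch below does this for two integer-valued laws; it does not implement the Stein-kernel identities of the corollary. The choice of a Binomial\((20, 3/20)\) law for \(X_n\) and a Poisson\((3)\) law for \(X_\infty \) is purely illustrative, total variation is taken as the supremum over sets (half the \(\ell ^1\) distance between the mass functions), and all names in the code are hypothetical.

```python
import math

N_TRIALS, LAM = 20, 3.0   # illustrative parameters (assumptions)
SUPPORT = range(0, 61)    # truncated common support; the tails are negligible


def binom_pmf(k):
    """Binomial(N_TRIALS, LAM / N_TRIALS) mass at k (0 beyond N_TRIALS)."""
    if k > N_TRIALS:
        return 0.0
    p = LAM / N_TRIALS
    return math.comb(N_TRIALS, k) * p**k * (1 - p) ** (N_TRIALS - k)


def poisson_pmf(k):
    """Poisson(LAM) mass at k."""
    return math.exp(-LAM) * LAM**k / math.factorial(k)


def cumulative(values):
    """Running sums, i.e. the distribution function on SUPPORT."""
    total, out = 0.0, []
    for v in values:
        total += v
        out.append(total)
    return out


p_n = [binom_pmf(k) for k in SUPPORT]
p_inf = [poisson_pmf(k) for k in SUPPORT]
F_n, F_inf = cumulative(p_n), cumulative(p_inf)

# Kolmogorov: largest gap between the two distribution functions.
kol = max(abs(a - b) for a, b in zip(F_n, F_inf))
# Total variation (sup over sets): half the l1 distance between the pmfs.
tv = 0.5 * sum(abs(a - b) for a, b in zip(p_n, p_inf))
# Wasserstein-1 on the integers: sum of the gaps between the two cdfs.
wass = sum(abs(a - b) for a, b in zip(F_n, F_inf))
print(kol, tv, wass)
```

For integer-valued laws these quantities are ordered as Kolmogorov, then total variation, then Wasserstein, which gives a quick consistency check on any numerical evaluation of the bounds.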
Corollary B.2
(Identity (3.6), Stein kernels, \(\ell = \pm 1\)) Under the same assumptions and with exactly the same notations as in Corollary 3.6, the following results hold true.
with
and
Cite this article
Ernst, M., Swan, Y. Distances Between Distributions Via Stein’s Method. J Theor Probab 35, 949–987 (2022). https://doi.org/10.1007/s10959-021-01075-8
DOI: https://doi.org/10.1007/s10959-021-01075-8
Keywords
- Stein’s method
- Stein equations
- Stein factors
- Kolmogorov distance
- Wasserstein distance
- Total variation distance
- Integral probability metrics