Distances Between Distributions Via Stein’s Method

Abstract

We build on the formalism developed in Ernst et al. (First order covariance inequalities via Stein’s method, 2019) to propose new representations of solutions to Stein equations. We provide new uniform and nonuniform bounds on these solutions (a.k.a. Stein factors). We use these representations to obtain representations for differences between expectations in terms of solutions to the Stein equations. We apply these to compute abstract Stein-type bounds on Kolmogorov, total variation and Wasserstein distances between arbitrary distributions. We apply our results to several illustrative examples and compare our results with current literature on the same topic, whenever possible. In all occurrences our results are competitive.

References

  1. Barbour, A.D., Holst, L., Janson, S.: Poisson Approximation. Clarendon Press, Oxford (1992)

  2. Barbour, A.D., Xia, A.: On Stein’s factors for Poisson approximation in Wasserstein distance. Bernoulli 12(6), 943–954 (2006)

  3. Baricz, Á.: Mills’ ratio: monotonicity patterns and functional inequalities. J. Math. Anal. Appl. 340(2), 1362–1370 (2008)

  4. Cacoullos, T., Papathanasiou, V., Utev, S.A., et al.: Variational inequalities with examples and an application to the central limit theorem. Ann. Probab. 22(3), 1607–1618 (1994)

  5. Chatterjee, S., Fulman, J., Röllin, A.: Exponential approximation by Stein’s method and spectral graph theory. ALEA 8, 197–223 (2011)

  6. Chatterjee, S., Shao, Q.-M., et al.: Nonnormal approximation by Stein’s method of exchangeable pairs with application to the Curie–Weiss model. Ann. Appl. Probab. 21(2), 464–483 (2011)

  7. Chen, L.H.: Poisson approximation for dependent trials. Ann. Probab. 3, 534–545 (1975)

  8. Chen, L.H., Goldstein, L., Shao, Q.-M.: Normal approximation by Stein’s method. Springer, Berlin (2010)

  9. Döbler, C.: Stein’s method of exchangeable pairs for absolutely continuous, univariate distributions with applications to the Polya urn model. arXiv:1207.0533 (2012)

  10. Döbler, C., et al.: Stein’s method of exchangeable pairs for the beta distribution and generalizations. Electron. J. Probab. 20, 1–34 (2015)

  11. Döbler, C., Peccati, G.: The gamma Stein equation and noncentral de Jong theorems. Bernoulli 24(4B), 3384–3421 (2018)

  12. Duembgen, L., Samworth, R., Wellner, J.: Bounding distributional errors via density ratios. arXiv:1905.03009 (2019)

  13. Ehm, W.: Binomial approximation to the Poisson binomial distribution. Stat. Probab. Lett. 11(1), 7–16 (1991)

  14. Erhardsson, T.: Stein’s method for Poisson and compound Poisson. Introd. Stein’s Method 4, 61 (2005)

  15. Ernst, M., Reinert, G., Swan, Y.: First order covariance inequalities via Stein’s method. Submitted for publication (2019)

  16. Ernst, M., Reinert, G., Swan, Y.: On infinite covariance expansions. arXiv:1906.08376 (2019)

  17. Gibbs, A.L., Su, F.E.: On choosing and bounding probability metrics. Int. Stat. Rev. 70(3), 419–435 (2002)

  18. Goldstein, L., Reinert, G.: Distributional transformations, orthogonal polynomials, and Stein characterizations. J. Theor. Probab. 18(1), 237–260 (2005)

  19. Goldstein, L., Reinert, G.: Stein’s method for the beta distribution and the Polya-Eggenberger urn. J. Appl. Probab. 50(04), 1187–1205 (2013)

  20. Ley, C., Reinert, G., Swan, Y.: Distances between nested densities and a measure of the impact of the prior in Bayesian statistics. Ann. Appl. Probab. 27(1), 216–241 (2017)

  21. Nourdin, I., Peccati, G.: Normal Approximations with Malliavin Calculus: from Stein’s Method to Universality, vol. 192. Cambridge University Press, Cambridge (2012)

  22. Pinelis, I.: Exact bounds on the closeness between the Student and standard normal distributions. ESAIM Probab. Stat. 19, 24–27 (2015)

Acknowledgements

The research of YS was partially supported by the Fonds de la Recherche Scientifique—FNRS under Grants Nos. F.4539.16 and J.0197.20.F. ME acknowledges partial funding via a Welcome Grant of the Université de Liège. YS thanks Céline Esser for many fruitful discussions and Davy Paindaveine for bringing paper [12] to our attention. We also thank the referee whose remarks and suggestions helped us improve the exposition of our results (and also correct several imprecisions). Most of all, Marie and Yvik thank Gesine Reinert whose ideas are at the basis of references [15, 16], without which this paper could not have been written.

Author information

Corresponding author

Correspondence to Yvik Swan.

Appendices

Appendix A: Some More Proofs

Proof of Lemma 2.23

Introduce \( \Phi ^{\ell }_p(u, x, v) = {\chi ^{\ell }(u, x)\chi ^{-\ell }(x, v)}/{p(x)}\) for all \(x \in {\mathcal {S}}(p)\) and 0 elsewhere, which allows us to perform “probabilistic integration” as follows: if \(f \in \mathrm {dom}(\Delta ^{-\ell })\) is such that \((\Delta ^{-\ell }f)\) is integrable on \([x_1, x_2] \cap {\mathcal {S}}(p)\), then

$$\begin{aligned} f(x_2)-f(x_1) = {\mathbb {E}}\left[ \Phi ^{\ell }_p(x_1, X, x_2) \Delta ^{-\ell }f(X)\right] \end{aligned}$$
(A.1)

for all \(x_1 < x_2 \in {\mathcal {S}}(p)\). We can use this function to obtain

$$\begin{aligned} \bar{h}(x)&={\mathbb {E}}\left[ (h(x)-h(X))(\chi ^\ell (X,x)+\chi ^{-\ell }(x,X)) \right] \\&= {\mathbb {E}}\left[ \Delta ^{-\ell } h(X_2) {\mathbb {E}}\left[ \Phi _p^\ell (X,X_2,x)-\Phi _p^\ell (x,X_2,X) | X_2\right] \right] \end{aligned}$$

(we use the fact that \(\chi ^\ell (x,y)+\chi ^{-\ell }(y,x) =1+{\mathbb {I}}[\ell =0]{\mathbb {I}}[x=y]\)) and it only remains to reorganize the integrand to obtain the claim. To this end, we note how, by definition,

$$\begin{aligned} {\mathbb {E}}\left[ \Phi _p^\ell (X,y,x)-\Phi _p^\ell (x,y,X) \right]&=\frac{\chi ^{-\ell }(y,x)}{p(y)} {\mathbb {E}}[\chi ^\ell (X,y)] - \frac{\chi ^{\ell }(x,y)}{p(y)}{\mathbb {E}}[\chi ^{-\ell }(y,X)] \\&= \chi ^{-\ell }(y, x) \frac{P(y - {\mathbb {I}}[\ell = 1])}{p(y)}\\&\quad - \chi ^{\ell }(x, y)\frac{\bar{P}(y+{\mathbb {I}}[\ell = -1]) }{p(y)}, \end{aligned}$$

where the first identity is immediate by definition of \(\Phi _p^\ell \) and the last identity follows from the definition of the generalized indicator \(\chi ^{\ell }\). \(\square \)

Proof of Lemma 2.26

The expressions (2.21) and (2.22) of the solution g follow directly from the definition of \({\mathcal {L}}_p^\ell \) and its representation (2.19). The first expression (2.26) of the derivative follows directly from (2.8). For the second claim, we first prove the following identities:

$$\begin{aligned}&\Delta ^{-\ell }g(x) \nonumber \\&\quad = \frac{{\mathbb {E}} \left[ \tilde{K}_p^{\ell }(X_1, x+\ell ) R_p^{\ell }(x, X_2) \bigg ( \Delta ^{-\ell }\eta (X_2) \Delta ^{-\ell }h(X_1) - \Delta ^{-\ell } h(X_2) \Delta ^{-\ell } \eta (X_1) \bigg ) \right] }{\big (-{\mathcal {L}}_p^\ell \eta (x)\big )\big (-{\mathcal {L}}_p^\ell \eta (x+\ell )\big )} \end{aligned}$$
(A.2)
$$\begin{aligned}&\quad = \frac{{\mathbb {E}} \left[ \bigg (\tilde{K}_p^{\ell }(X_1, x+\ell ) R_p^{\ell }(x, X_2)- R_p^{\ell }(x, X_1)\tilde{K}_p^{\ell }(X_2, x+\ell ) \bigg ) \Delta ^{-\ell }h(X_1) \Delta ^{-\ell }\eta (X_2) \right] }{\big (-{\mathcal {L}}_p^\ell \eta (x)\big )\big (-{\mathcal {L}}_p^\ell \eta (x+\ell )\big )} \end{aligned}$$
(A.3)

We first prove (A.2). Starting from (2.7) and applying repeatedly (2.19) then (2.20) (once to h and once to \(\eta \)), we obtain

$$\begin{aligned}&\Delta ^{-\ell } g(x)\\&\quad =\frac{{\mathbb {E}} \left[ \tilde{K}_p^{\ell }(X_1, x+\ell ) \bigg ( \bar{\eta }(x) \Delta ^{-\ell }h(X_1) - \bar{h}(x) \Delta ^{-\ell } \eta (X_1) \bigg ) \right] }{\big (-{\mathcal {L}}_p^\ell \eta (x)\big )\big (-{\mathcal {L}}_p^\ell \eta (x+\ell )\big )} \\&\quad = \frac{{\mathbb {E}} \left[ \tilde{K}_p^{\ell }(X_1, x+\ell ) R_p^{\ell }(x, X_2) \bigg ( \Delta ^{-\ell }\eta (X_2) \Delta ^{-\ell }h(X_1) - \Delta ^{-\ell } h(X_2) \Delta ^{-\ell } \eta (X_1) \bigg ) \right] }{\big (-{\mathcal {L}}_p^\ell \eta (x)\big )\big (-{\mathcal {L}}_p^\ell \eta (x+\ell )\big )}. \end{aligned}$$

We now prove (A.3). By arguments similar to those above, this follows from

$$\begin{aligned}&\Delta ^{-\ell } g(x) \\&\quad =\frac{{\mathbb {E}} \left[ \tilde{K}_p^{\ell }(X_1, x+\ell ) \bar{\eta }(x) \Delta ^{-\ell }h(X_1) \right] -\big (-{\mathcal {L}}_p^\ell \eta (x+\ell )\big ) {\mathbb {E}}\left[ R_p^\ell (x,X_1)\Delta ^{-\ell }h(X_1)\right] }{\big (-{\mathcal {L}}_p^\ell \eta (x)\big )\big (-{\mathcal {L}}_p^\ell \eta (x+\ell )\big )} \\&\quad = \frac{{\mathbb {E}} \left[ \bigg (\tilde{K}_p^{\ell }(X_1, x+\ell ) \bar{\eta }(x) - R_p^{\ell }(x, X_1)\big (-{\mathcal {L}}_p^\ell \eta (x+\ell )\big ) \bigg ) \Delta ^{-\ell }h(X_1) \right] }{\big (-{\mathcal {L}}_p^\ell \eta (x)\big )\big (-{\mathcal {L}}_p^\ell \eta (x+\ell )\big )}\\&\quad = \frac{{\mathbb {E}} \left[ \bigg (\tilde{K}_p^{\ell }(X_1, x+\ell ) R_p^{\ell }(x, X_2)- R_p^{\ell }(x, X_1)\tilde{K}_p^{\ell }(X_2, x+\ell ) \bigg ) \Delta ^{-\ell }h(X_1)\Delta ^{-\ell }\eta (X_2) \right] }{\big (-{\mathcal {L}}_p^\ell \eta (x)\big ) \big (-{\mathcal {L}}_p^\ell \eta (x+\ell )\big )} . \end{aligned}$$

To conclude, we decompose the above expectation into four parts according to whether \(X_i<x+{\mathbb {I}}[\ell = 1]\) or \(X_i\ge x+{\mathbb {I}}[\ell = 1]\), for \(i=1,2\) (i.e., using either \(\chi ^{-\ell }(X_i,x)\) or \(\chi ^{\ell }(x,X_i)\)). By considering separately \(\ell \in \{0,-1,1\}\), we can easily verify that

$$\begin{aligned} \tilde{K}_p^\ell (y,x+\ell ) = {\left\{ \begin{array}{ll} \dfrac{P(y-{\mathbb {I}}[\ell = 1])\bar{P}(x+{\mathbb {I}}[\ell = 1])}{p(y)p(x+\ell )}&{} \text{ if }\quad y<x+{\mathbb {I}}[\ell = 1] \\ \dfrac{P(x-{\mathbb {I}}[\ell = -1])\bar{P}(y+{\mathbb {I}}[\ell = -1])}{p(y)p(x+\ell )}&{} \text{ if }\quad y\ge x+{\mathbb {I}}[\ell = 1] \end{array}\right. } \end{aligned}$$

and

$$\begin{aligned} R_p^\ell (x,y) = {\left\{ \begin{array}{ll} \dfrac{P(y-{\mathbb {I}}[\ell = 1])}{p(y)}&{} \text{ if }\quad y<x+{\mathbb {I}}[\ell = 1] \\ \dfrac{-\bar{P}(y+{\mathbb {I}}[\ell = -1])}{p(y)}&{} \text{ if }\quad y\ge x+{\mathbb {I}}[\ell = 1] \end{array}\right. } \end{aligned}$$

Basic manipulations then give

$$\begin{aligned}&\Delta ^{-\ell } g(x) \big (-{\mathcal {L}}_p^\ell \eta (x)\big )\big (-{\mathcal {L}}_p^\ell \eta (x+\ell )\big ) =\frac{\bar{P}(x+{\mathbb {I}}[\ell = 1])+P(x-{\mathbb {I}}[\ell = -1])}{p(x+\ell )}\\&\quad \Bigg ({\mathbb {E}}\left[ \Delta ^{-\ell } h(X_1) \frac{\bar{P}(X_1+{\mathbb {I}}[\ell =-1])}{p(X_1)}\chi ^{\ell }(x,X_1)\right] \\&\qquad \times {\mathbb {E}}\left[ \Delta ^{-\ell } \eta (X_2) \frac{P(X_2-{\mathbb {I}}[\ell = 1])}{p(X_2)}\chi ^{-\ell }(X_2,x)\right] \\&\qquad - {\mathbb {E}}\left[ \Delta ^{-\ell } h(X_1) \frac{P(X_1-{\mathbb {I}}[\ell = 1])}{p(X_1)}\chi ^{-\ell }(X_1,x)\right] \\&\qquad \times {\mathbb {E}}\left[ \Delta ^{-\ell } \eta (X_2) \frac{\bar{P}(X_2+{\mathbb {I}}[\ell = -1])}{p(X_2)}\chi ^{\ell }(x,X_2)\right] \Bigg ) \end{aligned}$$

which leads to the claim as \(\bar{P}(x+{\mathbb {I}}[\ell =1]) +P(x-{\mathbb {I}}[\ell = -1])=1\) and \(\ell ={\mathbb {I}}[\ell =1] -{\mathbb {I}}[\ell = -1]\). \(\square \)
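The piecewise expressions for \(\tilde{K}_p^\ell \) and \(R_p^\ell \) used in this proof can be sanity-checked numerically for \(\ell = \pm 1\). The sketch below is hedged in several ways. It reads off from the derivations above that \(\bar{h}(x) = {\mathbb {E}}[R_p^\ell (x,X)\Delta ^{-\ell }h(X)]\) with \(\bar{h}(x) = h(x)-{\mathbb {E}}[h(X)]\), and that \(-{\mathcal {L}}_p^\ell \eta (x+\ell ) = {\mathbb {E}}[\tilde{K}_p^\ell (X,x+\ell )\Delta ^{-\ell }\eta (X)]\), which for \(\eta = \mathrm {Id}\) should return the Stein kernel \(\tau _p^\ell (x+\ell )\). The tail-sum formula used for \(\tau _p^{\pm }\), the convention \(\bar{P}(x) = {\mathbb {P}}[X\ge x]\), the binomial target and all function names are assumptions made for this illustration.

```python
from fractions import Fraction
from math import comb

# Illustrative target: Binomial(N, Q) on {0, ..., N} (our choice, not from the paper).
N, Q = 6, Fraction(1, 3)
MU = N * Q
SUPPORT = range(N + 1)
PMF = {k: comb(N, k) * Q**k * (1 - Q) ** (N - k) for k in SUPPORT}


def p(x):
    return PMF.get(x, Fraction(0))


def cdf(x):
    """P(x) = P[X <= x]."""
    return sum((p(k) for k in SUPPORT if k <= x), Fraction(0))


def sf(x):
    """bar P(x) = P[X >= x] (assumed convention in the discrete case)."""
    return sum((p(k) for k in SUPPORT if k >= x), Fraction(0))


def expect(fn):
    return sum((fn(k) * p(k) for k in SUPPORT), Fraction(0))


def delta(ell, f, x):
    """Delta^ell f(x) = (f(x + ell) - f(x)) / ell for ell in {-1, +1} (assumed)."""
    return Fraction(f(x + ell) - f(x), ell)


def R(ell, x, y):
    """Piecewise expression of R_p^ell(x, y) displayed in the proof."""
    a, b = int(ell == 1), int(ell == -1)
    if y < x + a:
        return cdf(y - a) / p(y)
    return -sf(y + b) / p(y)


def K_tilde(ell, y, x):
    """Piecewise expression of K~_p^ell(y, x + ell) displayed in the proof."""
    a, b = int(ell == 1), int(ell == -1)
    if y < x + a:
        return cdf(y - a) * sf(x + a) / (p(y) * p(x + ell))
    return cdf(x - b) * sf(y + b) / (p(y) * p(x + ell))


def tau(ell, x):
    """Stein kernel, ASSUMED to be a tail sum:
    tau^+(x) p(x) = sum_{y >= x} (y - mu) p(y), tau^-(x) p(x) = sum_{y > x} (y - mu) p(y)."""
    lo = x if ell == 1 else x + 1
    return sum(((k - MU) * p(k) for k in SUPPORT if k >= lo), Fraction(0)) / p(x)


def check_R(h):
    """h(x) - E[h(X)] = E[R(x, X) Delta^{-ell} h(X)] for every x in the support."""
    return all(
        h(x) - expect(h) == expect(lambda k: R(ell, x, k) * delta(-ell, h, k))
        for ell in (-1, 1) for x in SUPPORT
    )


def check_K():
    """E[K~(X, x + ell)] = tau^ell(x + ell) whenever x + ell lies in the support."""
    return all(
        expect(lambda k: K_tilde(ell, k, x)) == tau(ell, x + ell)
        for ell in (-1, 1) for x in SUPPORT if (x + ell) in SUPPORT
    )


def check_total_mass():
    """bar P(x + I[ell = 1]) + P(x - I[ell = -1]) = 1, as used in the last step."""
    return all(
        sf(x + int(ell == 1)) + cdf(x - int(ell == -1)) == 1
        for ell in (-1, 1) for x in SUPPORT
    )


if __name__ == "__main__":
    print(check_R(lambda x: x * x), check_K(), check_total_mass())
```

The checks only confirm that the displayed piecewise formulas are mutually consistent on one example; they do not replace the general argument.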

Proof of Lemma 2.27

The condition implies that \(g^-\) is nondecreasing and nonnegative over \({\mathcal {S}}(p) \cap (-\infty , \xi ]\) and nondecreasing and nonpositive over \({\mathcal {S}}(p) \cap (\xi , \infty )\). Therefore, the absolute value of the solution of the point-mass equation (2.28) reaches its supremum at \(\xi \) or \(\xi +1\), which gives the bound (2.29). Moreover, the supremum of the increment \(|\Delta g|\) is attained between \(\xi \) and \(\xi +1\). Using the explicit expression (2.17) and the relation \(\tau _p^\ell (x+\ell )p(x+\ell )=\tau _p^{-\ell }(x)p(x)\), we have

$$\begin{aligned} \sup _x |\Delta g(x)|&= g^-(\xi ) - g^-(\xi +1) =\frac{P(\xi -1)}{\tau _p^+(\xi )} +\frac{(1-P(\xi ))p(\xi )}{\tau _p^+(\xi +1)p(\xi +1)} \\&= \frac{P(\xi -1)}{\tau _p^+(\xi )} + \frac{1-P(\xi )}{\tau _p^-(\xi )}. \end{aligned}$$

Furthermore, as \(x-{\mathbb {E}}[X] = \tau _p^+(x)-\tau _p^-(x)\), we have \(\tau ^-_p(\xi ) \ge \tau ^+_p(\xi )\) if \(\xi \le {\mathbb {E}}[X]\) (resp. \(\tau ^-_p(\xi ) \le \tau ^+_p(\xi )\) if \(\xi \ge {\mathbb {E}}[X]\)). Therefore, the supremum is bounded by \(\frac{P(\xi -1)+1-P(\xi )}{\tau _p^+(\xi )} =\frac{1-p(\xi )}{\tau _p^+(\xi )}\) if \(\xi \le {\mathbb {E}}[X]\) and otherwise by \(\frac{1-p(\xi )}{\tau _p^-(\xi )}\).

By Remark 2.21, the solution \(g_A^\ell (x)\) is explicit and defined by \(g_\xi \) for \(\xi \in A\). The sign of \(g_\xi \) changes according to the relative position of \(\xi \) and x. Combined with the hypotheses, this implies that the maximal value of \(|g_A^-(x)|\) is attained either at \(x=\min _{\xi \in A}\{\xi \}=:\xi _{1}\) or at \(x=\max _{\xi \in A}\{\xi \}+1=:\xi _{2}+1\). Then,

$$\begin{aligned} \sup _x |g^-_A(x)|&= \max \left\{ \frac{P(\xi _1-1)}{p(\xi _1)\tau _p^+(\xi _1)}\sum _{j\in A}p(j), \frac{1-P(\xi _2)}{p(\xi _2)\tau _p^-(\xi _2)}\sum _{j\in A}p(j) \right\} \\&\le \left( \sum _{j\in A}p(j) \right) \sup _{\xi \in A} \left\{ \frac{1}{\tau _p^+(\xi )p(\xi )}, \frac{1}{\tau _p^-(\xi )p(\xi )}\right\} . \end{aligned}$$

Finally, due to the monotonicity of each function \(g_\xi \), the maximal difference \(|\Delta g_A(x)|\) is bounded by the supremum of \(|\Delta g_\xi (x)|\) over \(\xi \in A\), which is enough to conclude. \(\square \)
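For a single point mass, the kernel relations and the resulting bound on \(\sup _x|\Delta g(x)|\) used in this proof can be evaluated on a concrete example. The following Python sketch does so for a binomial target. It assumes (this is not stated in the appendix) that the discrete Stein kernels are the tail sums \(\tau _p^+(x)p(x) = \sum _{y\ge x}(y-{\mathbb {E}}[X])p(y)\) and \(\tau _p^-(x)p(x) = \sum _{y> x}(y-{\mathbb {E}}[X])p(y)\), which is consistent with the two relations quoted in the proof. It only evaluates the displayed expressions and does not solve the Stein equation (2.28) itself; all names are hypothetical.

```python
from fractions import Fraction
from math import comb

# Illustrative target: Binomial(N, Q) on {0, ..., N} (our choice, not from the paper).
N, Q = 6, Fraction(1, 3)
MU = N * Q
SUPPORT = range(N + 1)
PMF = {k: comb(N, k) * Q**k * (1 - Q) ** (N - k) for k in SUPPORT}


def p(x):
    return PMF.get(x, Fraction(0))


def cdf(x):
    """P(x) = P[X <= x]."""
    return sum((p(k) for k in SUPPORT if k <= x), Fraction(0))


def tau_plus(x):
    """ASSUMED forward kernel: sum_{y >= x} (y - mu) p(y) / p(x)."""
    return sum(((k - MU) * p(k) for k in SUPPORT if k >= x), Fraction(0)) / p(x)


def tau_minus(x):
    """ASSUMED backward kernel: sum_{y > x} (y - mu) p(y) / p(x)."""
    return sum(((k - MU) * p(k) for k in SUPPORT if k > x), Fraction(0)) / p(x)


def check_kernel_relations():
    """tau^+(x+1) p(x+1) = tau^-(x) p(x) and x - E[X] = tau^+(x) - tau^-(x)."""
    shift = all(tau_plus(x + 1) * p(x + 1) == tau_minus(x) * p(x) for x in range(N))
    mean = all(x - MU == tau_plus(x) - tau_minus(x) for x in SUPPORT)
    return shift and mean


def middle_expression(xi):
    """P(xi-1)/tau^+(xi) + (1-P(xi)) p(xi) / (tau^+(xi+1) p(xi+1))."""
    return cdf(xi - 1) / tau_plus(xi) + (1 - cdf(xi)) * p(xi) / (tau_plus(xi + 1) * p(xi + 1))


def sup_increment(xi):
    """P(xi-1)/tau^+(xi) + (1-P(xi))/tau^-(xi), the value of sup_x |Delta g(x)| in the proof."""
    return cdf(xi - 1) / tau_plus(xi) + (1 - cdf(xi)) / tau_minus(xi)


def bound(xi):
    """(1-p(xi))/tau^+(xi) if xi <= E[X], and (1-p(xi))/tau^-(xi) otherwise."""
    kernel = tau_plus(xi) if xi <= MU else tau_minus(xi)
    return (1 - p(xi)) / kernel


if __name__ == "__main__":
    # Interior points only: tau^+ vanishes at 0 and tau^- vanishes at N.
    for xi in range(1, N):
        print(xi, sup_increment(xi), bound(xi))
```

For this binomial example the assumed kernels reduce to \(\tau _p^+(x) = (1-q)x\) and \(\tau _p^-(x) = q(N-x)\), with \(q\) the success probability, so the bound can also be read off in closed form.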

Proof of Theorem 3.2

First take \(c_1(x)= c_2(x) = 1 \) in (3.4). Without any further assumptions on h, the solution \(g_h^*\) of (1.5) with \(c(x)=1\) can be represented as

$$\begin{aligned} g_h^*(x) = \frac{{\mathcal {L}}_\infty ^\ell h(x+\ell )}{c_1(x+\ell )} = {\mathcal {L}}_\infty ^\ell h(x+\ell ) \end{aligned}$$

Hence, we obtain (3.5).

Next take \(\eta _1 = \eta _2 = \mathrm {Id}\) in (3.3). Then, \(-{\mathcal {L}}_{\infty }^\ell \eta _1(x) = \tau _\infty ^\ell (x)\) and \(-{\mathcal {L}}_n^\ell \eta _2(x) = \tau _n^\ell (x)\), the Stein kernels of \(p_\infty \) and \(p_n\). Without any further assumptions on h, the solution \(g_h(x)\) of (1.4) with \(\eta =\mathrm {Id}\) can be represented as

$$\begin{aligned} g_h(x) = \frac{-{\mathcal {L}}_\infty ^\ell h(x+\ell )}{\tau _\infty ^\ell (x+\ell )}. \end{aligned}$$

Hence we get (3.6). \(\square \)

Appendix B: Some More Inequalities

Corollary B.1

(Identity (3.6), Stein kernels and \(\ell = 0\)) Under the same assumptions and with exactly the same notation as in Corollary 3.4, the following results hold true.

  1.

    The Kolmogorov distance between the random variables \(X_n\) and \(X_\infty \) is

    $$\begin{aligned}&\mathrm {Kol}(X_n,X_\infty ) \nonumber \\&\quad = \sup _z \Bigg | {\mathbb {E}}\Bigg [ \frac{\tau _n(X_n) -\tau _\infty (X_n)}{\tau _{\infty }(X_n)}{\mathbb {I}}_{{\mathcal {S}}_{\infty }}(X_n) \end{aligned}$$
    (B.1)
    $$\begin{aligned}&\qquad \times \Bigg ( {P_{\infty }(z) - {\mathbb {I}}[X_n \le z]} + \frac{X_n -{\mathbb {E}}[X_{\infty }]}{\tau _{\infty }(X_n)} \frac{P_{\infty }(X_n \wedge z) \bar{P}_{\infty }(X_n\vee z)}{p_{\infty }(X_n)} \Bigg ) \Bigg ] + \kappa _{\mathrm {Id}}(z) \Bigg | \nonumber \\&\quad \le {\mathbb {E}}\left[ \left| \frac{\tau _n(X_n)}{\tau _{\infty }(X_n)}-1 \right| \left( 1 + \frac{|X_n -{\mathbb {E}}[X_{\infty }]|}{\tau _{\infty }(X_n)} \frac{P_\infty (X_n)\bar{P}_\infty (X_n)}{p_\infty (X_n)} \right) {\mathbb {I}}_{{\mathcal {S}}_{\infty }}(X_n) \right] + \sup _z |\kappa _{\mathrm {Id}}(z)| \end{aligned}$$
    (B.2)

    where

    $$\begin{aligned} \kappa _{\mathrm {Id}}(z)&= (\mu _n - \mu _{\infty }) {\mathbb {E}} \left[ \frac{P_{\infty }(X_n \wedge z) \bar{P}_{\infty }(X_n\vee z)}{\tau _{\infty }(X_n)p_{\infty }(X_n)} \right] \\&\quad + \lim _{x \nearrow b_n \wedge b_{\infty }}\frac{\tau _n(x)}{\tau _\infty (x)} \frac{p_n(x)}{p_\infty (x)}P_\infty (x\wedge z)\bar{P}_\infty (x\vee z)\\&\quad -\lim _{x \searrow a_n \vee a_{\infty }}\frac{\tau _n(x)}{\tau _\infty (x)} \frac{p_n(x)}{p_\infty (x)}P_\infty (x\wedge z)\bar{P}_\infty (x\vee z). \end{aligned}$$
  2.

    The total variation distance between \(X_n\) and \(X_\infty \) is

    $$\begin{aligned}&\mathrm {TV}(X_n, X_{\infty })\nonumber \\&\quad = \kappa _{\mathrm {Id}}({\mathbb {I}}_{A_{n}^{\infty }}) +{\mathbb {E}}\Bigg [ \frac{\tau _n(X_n) - \tau _\infty (X_n)}{\tau _{\infty }(X_n)}{\mathbb {I}}_{{\mathcal {S}}_{\infty }}(X_n) \end{aligned}$$
    (B.3)
    $$\begin{aligned}&\qquad \times \Bigg ({P}_{\infty }(A_n^{\infty }) - {\mathbb {I}}_{ A_n^{\infty }}(X_n)\nonumber \\&\qquad +\frac{X_n -{\mathbb {E}}[X_{\infty }]}{\tau _{\infty }(X_n)} \frac{ {P}_\infty (A_{n}^{\infty }\cap (-\infty , X_n])-{P}_\infty (A_{n}^{\infty }) P_{\infty }(X_n)}{p_\infty (X_n)} \Bigg ) \Bigg ] \nonumber \\&\quad \le {\mathbb {E}}\left[ \left| \frac{\tau _n(X_n)}{\tau _{\infty }(X_n)}-1 \right| \left( 1 + \frac{|X_n -{\mathbb {E}}[X_{\infty }]|}{\tau _{\infty }(X_n)} \frac{P_\infty (X_n)\bar{P}_\infty (X_n)}{p_\infty (X_n)} \right) {\mathbb {I}}_{{\mathcal {S}}_{\infty }}(X_n)\right] + \kappa _{\mathrm {Id}}({\mathbb {I}}_{A_{n}^{\infty }}) \end{aligned}$$
    (B.4)

    with

    $$\begin{aligned} \kappa _{\mathrm {Id}}({\mathbb {I}}_{A_{n}^{\infty }})&= \lim _{x \nearrow b_n \wedge b_{\infty }}\frac{\tau _n(x)}{\tau _\infty (x)} \frac{p_n(x)}{p_\infty (x)} \left( {P}_\infty (A_{n}^{\infty }\cap (-\infty ,x]) -{P}_\infty (A_{n}^{\infty }) P_{\infty }(x)\right) \\&\quad - \lim _{x \searrow a_n \vee a_{\infty }}\frac{\tau _n(x)}{\tau _\infty (x)} \frac{p_n(x)}{p_\infty (x)} \left( {P}_\infty (A_{n}^{\infty }\cap (-\infty ,x]) -{P}_\infty (A_{n}^{\infty }) P_{\infty }(x)\right) \\&\quad + (\mu _n-\mu _{\infty }) {\mathbb {E}} \left[ \frac{ {P}_\infty (A_{n}^{\infty } \cap (-\infty , X_n])-{P}_\infty (A_{n}^{\infty }) P_{\infty }(X_n)}{{\tau _{\infty }(X_n)} p_\infty (X_n)} \right] \end{aligned}$$
  3.

    The Wasserstein distance between \(X_n\) and \(X_\infty \) is

    $$\begin{aligned} \mathrm {Wass}(X_n, X_{\infty })&= \sup _{h \in \mathrm {Lip}(1)} \Bigg | \kappa _{\mathrm {Id}}(h) \end{aligned}$$
    (B.5)
    $$\begin{aligned}&\quad + {\mathbb {E}}\left[ \frac{\tau _n(X_n) -\tau _\infty (X_n)}{\tau _{\infty }(X_n)} h'(X_\infty ) \right. \nonumber \\&\quad \left. \left( R_{\infty }(X_n, X_{\infty }) +\frac{X_n-{\mathbb {E}}[X_{\infty }]}{\tau _{\infty }(X_n)} \tilde{K}_\infty (X_n,X_\infty )\right) {\mathbb {I}}_{{\mathcal {S}}_{\infty }}(X_n)\right] \Bigg | \nonumber \\&\quad \le 2 {\mathbb {E}}\left[ \left| \frac{\tau _n(X_n)}{\tau _{\infty }(X_n)}-1 \right| |X_n - {\mathbb {E}}[X_{\infty }]|{\mathbb {I}}_{{\mathcal {S}}_{\infty }}(X_n)\right] + \sup _{h \in \mathrm {Lip}(1)} \kappa _{\mathrm {Id}}(h) \end{aligned}$$
    (B.6)

    where

    $$\begin{aligned} \kappa _{\mathrm {Id}}(h)&= \lim _{x \searrow a_n \vee a_{\infty }} \frac{\tau _n(x)}{\tau _\infty (x)}\frac{p_n(x)}{p_{\infty }(x)} \int _{a_{\infty }}^{b_{\infty }} h'(u) {P_{\infty }(x \wedge u) \bar{P}_{\infty }(x \vee u) } \mathrm {d}u\\&\quad - \lim _{x \nearrow b_n \wedge b_{\infty }}\frac{\tau _n(x)}{\tau _\infty (x)} \frac{p_n(x)}{p_{\infty }(x)} \int _{a_{\infty }}^{b_{\infty }}h'(u) {P_{\infty }(x \wedge u) \bar{P}_{\infty }(x \vee u) } \mathrm {d}u\\&\quad + (\mu _n - \mu _{\infty }) {\mathbb {E}}\left[ \frac{h'(X_\infty )}{\tau _{\infty }(X_{n})} \left( R_{\infty }(X_n, X_{\infty }) + \frac{X_n - {\mathbb {E}}[X_{\infty }]}{\tau _{\infty }(X_n)} \tilde{K}_\infty (X_n, X_\infty )\right) \right] \end{aligned}$$

Corollary B.2

(Identity (3.6), Stein kernels, \(\ell = \pm 1\)) Under the same assumptions and with exactly the same notation as in Corollary 3.6, the following results hold true.

$$\begin{aligned}&\mathrm {TV}(X_n, X_{\infty }) \\&\quad = \kappa _{\mathrm {Id}}^{\ell }({\mathbb {I}}_{A_{n}^{\infty }}) +{\mathbb {E}}\Bigg [ \frac{\tau _n^{\ell }(X_n) -\tau _\infty ^{\ell }(X_n)}{\tau _{\infty }^{\ell } (X_n)} {\mathbb {I}}_{{\mathcal {S}}_{\infty }}(X_n+\ell ) \\&\qquad \times \Bigg ({P}_{\infty }(A_n^{\infty }) - {\mathbb {I}}_{ A_n^{\infty }}(X_n) + \frac{X_n -{\mathbb {E}}[X_{\infty }]}{\tau _{\infty }^{\ell }(X_n+\ell )}\\&\qquad \frac{ {P}_\infty (A_{n}^{\infty }\cap (-\infty , X_n-{\mathbb {I}}[\ell = -1]])-{P}_\infty (A_{n}^{\infty }) P_{\infty }(X_n-{\mathbb {I}}[\ell = -1])}{p_\infty (X_n+\ell )} \Bigg ) \Bigg ] \\&\quad \le {\mathbb {E}}\left[ \left| \frac{\tau _n^{\ell }(X_n)}{\tau _{\infty }^{\ell }(X_n)}-1 \right| \right. \\&\qquad \left. \left( 1 + \frac{|X_n -{\mathbb {E}}[X_{\infty }]|}{\tau _{\infty }^{\ell }(X_n+\ell )} \frac{P_\infty (X_n-{\mathbb {I}}[\ell = -1])\bar{P}_\infty (X_n-{\mathbb {I}}[\ell = -1])}{p_\infty (X_n+\ell )} \right) {\mathbb {I}}_{{\mathcal {S}}_{\infty }}(X_n+\ell ) \right] \\&\qquad + \kappa _{\mathrm {Id}}^{\ell }({\mathbb {I}}_{A_{n}^{\infty }}) \end{aligned}$$

with

$$\begin{aligned} \kappa _{\mathrm {Id}}^{+}({\mathbb {I}}_{A_{n}^{\infty }})&=-\lim _{x \searrow a_n \vee a_{\infty }}\frac{\tau _n^{+} (x)}{\tau _{\infty }^{+}(x)}\frac{p_n(x)}{p_\infty (x)} \left( {P}_\infty (A_{n}^{\infty }\cap (-\infty ,x-1]) -{P}_\infty (A_{n}^{\infty }) P_{\infty }(x-1)\right) \\&\quad + (\mu _\infty -\mu _n) {\mathbb {E}} \left[ \frac{ {P}_\infty (A_{n}^{\infty }\cap (-\infty , X_n])-{P}_\infty (A_{n}^{\infty }) P_{\infty }(X_n)}{{\tau _{\infty }^{+}(X_n+1)} p_\infty (X_n+1)} {\mathbb {I}}_{{\mathcal {S}}_{\infty }}(X_n+1)\right] \end{aligned}$$

and

$$\begin{aligned} \kappa _{\mathrm {Id}}^{-}({\mathbb {I}}_{A_{n}^{\infty }})&=\lim _{x \nearrow b_n \wedge b_{\infty }}\frac{\tau _n^{-}(x)}{\tau _{\infty }^{-}(x)} \frac{p_n(x)}{p_\infty (x)} \left( {P}_\infty (A_{n}^{\infty }\cap (-\infty ,x])-{P}_\infty (A_{n}^{\infty }) P_{\infty }(x)\right) \\&\quad + (\mu _\infty -\mu _{n}) {\mathbb {E}} \left[ \frac{ {P}_\infty (A_{n}^{\infty }\cap (-\infty ,X_n]) -{P}_\infty (A_{n}^{\infty }) P_{\infty }(X_n)}{{\tau _{\infty }^{-}(X_n-1)} p_\infty (X_n-1)} {\mathbb {I}}_{{\mathcal {S}}_{\infty }}(X_n-1)\right] \end{aligned}$$

About this article

Cite this article

Ernst, M., Swan, Y. Distances Between Distributions Via Stein’s Method. J Theor Probab 35, 949–987 (2022). https://doi.org/10.1007/s10959-021-01075-8

  • DOI: https://doi.org/10.1007/s10959-021-01075-8

Keywords

  • Stein’s method
  • Stein equations
  • Stein factors
  • Kolmogorov distance
  • Wasserstein distance
  • Total variation distance
  • Integral probability metrics

Mathematics Subject Classification (2020)

  • 47N30
  • 62E17