Probability distribution as a path and its action integral

Takeuchi, Hiroyuki

doi:10.1007/s42081-019-00062-y

Original Paper
Published: 02 December 2019

Japanese Journal of Statistics and Data Science, Volume 3, pages 485–511 (2020)

Abstract

To describe the convergence in law of a sequence of probability distributions, “the principle of least action” is introduced nonparametrically into statistics. To apply the calculus of variations, a probability measure is treated as a path (in a suitable sense), and it is shown that saddlepoints, which appear in the method of saddlepoint approximations, play a crucial role. An action integral, i.e., a functional of the saddlepoint, is defined as a definite integral of entropy. As a saddlepoint equation naturally appears in the Gâteaux derivative of that integral, a unique saddlepoint may be found as an optimal path for this variational problem. Consequently, by virtue of the one-to-one correspondence between probability measures and saddlepoints, convergence in law is described by a decreasing sequence of action integrals. Thereby, a new criterion for evaluating the convergence is introduced into statistics and a novel interpretation of saddlepoints is provided.

Acknowledgements

The author would like to express his sincere thanks to the referees for their insightful comments.

Author information

Authors and Affiliations

Department of Economics, Tokyo International University, 13-1 Matobakita 1-chome, Kawagoe, Saitama, 350-1197, Japan
Hiroyuki Takeuchi

Corresponding author

Correspondence to Hiroyuki Takeuchi.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Lemma 8.1

Let $\alpha (t)$ be the saddlepoint of a probability distribution $F \in {{\mathcal {P}}}$. The $i$-th derivative of $\alpha (t)$ at $t = \mu$ and the $i$-th cumulant $\kappa _i$ of $F$ are related as follows:

$$\begin{aligned} \alpha (\mu ) &= 0, \quad \alpha '(\mu ) = \frac{1}{\kappa _2}, \quad \alpha ''(\mu ) = - \frac{\kappa _3}{\kappa _2^3}, \quad \alpha ^{(3)}(\mu ) = 3 \frac{\kappa _3^2}{\kappa _2^5} - \frac{\kappa _4}{\kappa _2^4}, \\ \alpha ^{(4)}(\mu ) &= - 15 \frac{\kappa _3^3}{\kappa _2^7} + 10 \frac{\kappa _3 \kappa _4}{\kappa _2^6} - \frac{\kappa _5}{\kappa _2^5}, \\ \alpha ^{(5)}(\mu ) &= 105 \frac{\kappa _3^4}{\kappa _2^9} - 105 \frac{\kappa _3^2 \kappa _4}{\kappa _2^8} + \frac{15 \kappa _3 \kappa _5 + 10 \kappa _4^2}{\kappa _2^7} - \frac{\kappa _6}{\kappa _2^6}, \ \cdots \end{aligned}$$

(23)

Conversely, let $\alpha ^{(i)} :=\alpha ^{(i)}(\mu )$. Then the cumulants are as follows.

$$\begin{aligned} \kappa _1 & = \alpha ^{-1}(0), \quad \kappa _2 = \frac{1}{\alpha '}, \quad \kappa _3 = - \frac{\alpha ''}{(\alpha ')^3}, \quad \kappa _4 = 3 \frac{(\alpha '')^2}{(\alpha ')^5} - \frac{\alpha ^{(3)}}{(\alpha ')^4}, \nonumber \\ \kappa _5 & = - 15 \frac{(\alpha '')^3}{(\alpha ')^7} + 10 \frac{\alpha ^{(3)} \alpha ''}{(\alpha ')^6} - \frac{\alpha ^{(4)}}{(\alpha ')^5}, \ \cdots \end{aligned}$$

(24)

Proof

(23) can be obtained by repeated differentiation of the saddlepoint equation (2) with respect to $t$. (24) follows by solving (23) successively for the cumulants. $\square$
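As an illustrative check of (23), the following minimal sketch compares both sides for a gamma distribution with shape $k$ and unit scale. Several things here are assumptions not taken from the text: that the saddlepoint equation (2) has the standard form $K'(\alpha (t)) = t$ with $K$ the cumulant generating function, that this distribution is an admissible example, and all function names. Under those assumptions, $\alpha (t) = 1 - k/t$, $\mu = k$, and $\kappa _i = k \, (i-1)!$, so both sides are available in closed form and can be compared exactly with rational arithmetic.

```python
from fractions import Fraction
from math import factorial


def gamma_cumulants(k, order):
    """Cumulants kappa_1..kappa_order of Gamma(shape=k, scale=1): kappa_i = k*(i-1)!."""
    return [Fraction(k) * factorial(i - 1) for i in range(1, order + 1)]


def derivs_from_cumulants(kappa):
    """Right-hand sides of (23) for the first four derivatives of alpha at mu.

    kappa[0] is kappa_1, kappa[1] is kappa_2, and so on.
    """
    _, k2, k3, k4, k5 = kappa[:5]
    d1 = 1 / k2
    d2 = -k3 / k2**3
    d3 = 3 * k3**2 / k2**5 - k4 / k2**4
    d4 = -15 * k3**3 / k2**7 + 10 * k3 * k4 / k2**6 - k5 / k2**5
    return [d1, d2, d3, d4]


def gamma_saddlepoint_derivs(k):
    """Exact derivatives of alpha(t) = 1 - k/t at t = mu = k.

    The i-th derivative of alpha is (-1)^(i+1) * i! * k / t^(i+1).
    """
    k = Fraction(k)
    return [(-1) ** (i + 1) * factorial(i) * k / k ** (i + 1) for i in range(1, 5)]


if __name__ == "__main__":
    for k in (1, 2, 5):
        lhs = gamma_saddlepoint_derivs(k)
        rhs = derivs_from_cumulants(gamma_cumulants(k, 5))
        print(k, lhs == rhs)
```

Exact fractions are used so that agreement is an identity rather than a floating-point coincidence; the check covers only the first four derivatives.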

Hereafter, we suppose that $F$ has bounded support contained in $[-L, L] \subset {{\mathbb {R}}}$. Let $F_n$ be the empirical distribution function of a sample of size $n$ from $F$, and let $M_n$ and $M$ be the moment generating functions of $F_n$ and $F$, respectively.

Lemma 8.2

We suppose that a function $\alpha : t \in {{\mathbb {R}}} \mapsto \alpha (t) \in {{\mathbb {R}}}$ belongs to the $C^1$ class and is strictly increasing on a neighborhood of $t = \mu$. If we define

$$\begin{aligned} \Delta (\varepsilon ) :=\sup _{t \in I_{\delta }} \{ \alpha (t + \varepsilon ) - \alpha (t), \, \alpha (t) - \alpha (t - \varepsilon ) \}, \end{aligned}$$

(25)

for some $\delta > 0$, where $I_{\delta } :=\{ t : |t - \mu | \le \delta \}$, then we have the following for sufficiently small $\varepsilon > 0$:

(i) $\Delta (\varepsilon ) > 0$.
(ii) There exists a $\delta ' > \delta$ such that $\Delta (\varepsilon ) \le \varepsilon \max _{t \in I_{\delta '}} \alpha '(t)$.

Proof

By assumption, there exists some $\delta ' > 0$ such that $\alpha = \alpha (t)$ is strictly increasing on $I_{\delta '}$. We fix any $\delta$ with $0< \delta < \delta '$, and for any $\varepsilon$ such that $0< \varepsilon < \delta ' - \delta$ we have $I_{\delta + \varepsilon } \subset I_{\delta '}$. Therefore, $\alpha = \alpha (t)$, $\alpha = \alpha (t + \varepsilon )$, and $\alpha = \alpha (t - \varepsilon )$ are well defined; furthermore, they are uniformly continuous and strictly increasing on $I_{\delta }$. We define $\Delta ^+ (\varepsilon )$ and $\Delta ^-(\varepsilon )$ by

$$\Delta ^+ (\varepsilon ) :=\sup _{t \in I_{\delta }} \{ \alpha (t + \varepsilon ) - \alpha (t) \} , \quad \Delta ^-(\varepsilon ) :=\sup _{t \in I_{\delta }} \{ \alpha (t) - \alpha (t - \varepsilon ) \}.$$

(i) If $t' \in I_{\delta }$ and $\varepsilon > 0$, then $\Delta ^+ (\varepsilon ) \ge \alpha (t' + \varepsilon ) - \alpha (t') > 0.$ Likewise, $\Delta ^-(\varepsilon ) > 0$ is also true. Hence, we have $\Delta (\varepsilon ) = \max \{ \Delta ^+ (\varepsilon ), \, \Delta ^- (\varepsilon ) \} > 0$.

(ii) As $\alpha (t) = \alpha (t - \varepsilon ) + \varepsilon \alpha '(t - \theta \varepsilon )$ for some $\theta$ $(0< \theta < 1)$, we have $\alpha (t) - \alpha (t - \varepsilon ) \le \varepsilon \max _{t \in I_{\delta + \varepsilon }} \alpha '(t)$ and $\alpha (t + \varepsilon ) - \alpha (t) \le \varepsilon \max _{t \in I_{\delta + \varepsilon }} \alpha '(t)$ for $t \in I_{\delta }$. Hence, $\Delta (\varepsilon ) \le \varepsilon \max _{t \in I_{\delta + \varepsilon }} \alpha '(t) \le \varepsilon \max _{t \in I_{\delta '}} \alpha '(t).$ $\square$
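The two claims of Lemma 8.2 can be illustrated numerically. The sketch below is not from the paper: it assumes the hypothetical choice $\alpha (t) = 1 - k/t$ with $\mu = k = 2$, takes $I_{\delta } = \{ t : |t - \mu | \le \delta \}$, and approximates the supremum in (25) on a finite grid, which can only underestimate it. For this particular $\alpha$ the derivative is decreasing, so the maximum of $\alpha '$ over $I_{\delta '}$ is attained at the left endpoint.

```python
def alpha(t, k=2.0):
    """Hypothetical saddlepoint alpha(t) = 1 - k/t, with alpha(mu) = 0 at mu = k."""
    return 1.0 - k / t


def alpha_prime(t, k=2.0):
    """Derivative alpha'(t) = k / t^2, positive and decreasing for t > 0."""
    return k / t**2


def delta_on_grid(eps, mu, delta, points=2001):
    """Grid approximation (from below) of Delta(eps) in (25) over I_delta."""
    ts = [mu - delta + 2.0 * delta * i / (points - 1) for i in range(points)]
    return max(max(alpha(t + eps) - alpha(t), alpha(t) - alpha(t - eps)) for t in ts)


def bound(eps, mu, delta_prime):
    """eps * max of alpha' over I_{delta'}; the maximum sits at the left
    endpoint because alpha' is decreasing for this particular alpha."""
    return eps * alpha_prime(mu - delta_prime)


if __name__ == "__main__":
    mu, delta, delta_prime = 2.0, 0.5, 1.0
    # eps must satisfy 0 < eps < delta_prime - delta, as in the proof.
    for eps in (0.4, 0.2, 0.1, 0.05):
        print(eps, delta_on_grid(eps, mu, delta), bound(eps, mu, delta_prime))
```

In each printed row the middle value should be positive and no larger than the last one, mirroring (i) and (ii); both shrink linearly in $\varepsilon$.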

Lemma 8.3

We assume that a function $t : \alpha \in {{\mathbb {R}}} \mapsto t(\alpha ) \in {{\mathbb {R}}}$ belongs to the $C^1$ class and is strictly increasing on a neighborhood of the origin with $t(0) = \mu$. Furthermore, for a sequence of continuous functions $\{ t_n \}_{n \in {{\mathbb {N}}}}$ we assume that

$$\lim _{n \rightarrow \infty } \sup _{|\alpha | \le \eta } |t_n(\alpha ) - t(\alpha )| = 0,$$

(26)

for some $\eta > 0$. Then there exists a $\delta > 0$ such that

$$\lim _{n \rightarrow \infty } \sup _{t \in I_{\delta }} |\alpha _n(t) - \alpha (t)| = 0,$$

(27)

where $\alpha _n(t)$ and $\alpha (t)$ are the inverse functions of $t_n(\alpha )$ and $t(\alpha )$, respectively.

Proof

By (26), for any $\varepsilon > 0$ there exists an integer $N_0$ such that if $n \ge N_0$, then we have $(\alpha , t_n(\alpha )) \in \{ (\alpha , t) : |\alpha | \le \eta , \, |t - t(\alpha )| < \varepsilon \}.$ Moreover, there exists a positive constant $\delta$, which does not depend on $\varepsilon$, such that $t(-\eta ) + \varepsilon< \mu - \delta< \mu< \mu + \delta < t(+\eta ) - \varepsilon .$ With this $\delta$, we have

$$(t, \alpha _n(t)) \in \{ (t, \alpha ) : |t - \mu | \le \delta , \ \alpha (t - \varepsilon )< \alpha < \alpha (t + \varepsilon ) \},$$

(28)

for $n \ge N_0$. As the quantity $\Delta (\varepsilon )$ defined in Lemma 8.2 is positive and independent of t, we have for $n \ge N_0$

$$(t, \alpha _n(t)) \in \{ (t, \alpha ) : |t - \mu | \le \delta , \ |\alpha - \alpha (t)| < \Delta (\varepsilon ) \} ,$$

by (28). Hence,

$$\limsup _{n \rightarrow \infty } \sup _{|t - \mu | \le \delta } |\alpha _n(t) - \alpha (t)| \le \Delta (\varepsilon ),$$

and by Lemma 8.2 (ii), the conclusion follows upon letting $\varepsilon \rightarrow 0$. $\square$
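The passage from (26) to (27) can be seen in a small numerical experiment. Everything in the sketch below is a hypothetical choice made for illustration: $t(\alpha ) = \mu + \alpha + \alpha ^3$ with $\mu = 0$, the perturbed sequence $t_n(\alpha ) = t(\alpha ) + \sin (5\alpha )/n$ (strictly increasing for $n \ge 10$), $\eta = \delta = 1$, and inversion by bisection. Since $\sup |t_n - t| \le 1/n$ and $t' \ge 1$, the inverse functions should differ by at most about $1/n$ on $I_{\delta }$.

```python
import math

MU = 0.0  # hypothetical mean, so that t(0) = MU


def t(alpha):
    """Hypothetical strictly increasing C^1 function with t(0) = MU."""
    return MU + alpha + alpha**3


def t_n(alpha, n):
    """Perturbation of t with sup |t_n - t| <= 1/n; increasing for n >= 10."""
    return t(alpha) + math.sin(5.0 * alpha) / n


def invert(f, target, lo=-1.0, hi=1.0, iters=60):
    """Inverse of a strictly increasing f on [lo, hi], found by bisection."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def sup_inverse_error(n, delta=1.0, points=401):
    """Grid approximation of sup over I_delta of |alpha_n(t) - alpha(t)|."""
    ts = [MU - delta + 2.0 * delta * i / (points - 1) for i in range(points)]
    return max(abs(invert(lambda a: t_n(a, n), s) - invert(t, s)) for s in ts)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, sup_inverse_error(n))
```

The printed errors should decrease roughly in proportion to $1/n$, which is the behavior that (27) asserts in the limit.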

The following corollary is an immediate consequence of Lemma 8.3.

Corollary 8.1

If we replace $t(\alpha )$ with its inverse $\alpha (t)$ in Lemma 8.3, then the same conclusion follows under the condition $\alpha (\mu ) = 0$.

Lemma 8.4

For any fixed $\eta > 0$, we have the following:

$$\begin{aligned}&\mathrm{(i)} \sup _{|\alpha | \le \eta } |M_n(\alpha ) - M(\alpha )| \le 4e^{\eta L} \sup _{|x| \le L} |F_n(x) - F(x)|, \quad \text{ a.s. } \\&\mathrm{(ii)} \sup _{|\alpha | \le \eta } |M'_n(\alpha ) - M'(\alpha )| \le 4L e^{\eta L} \sup _{|x| \le L} |F_n(x) - F(x)|, \quad \text{ a.s. } \end{aligned}$$

Proof

(i) As $F_n(x) - F(x)$ is of bounded variation for $|x| \le L$,

$$\begin{aligned}&\sup _{|\alpha | \le \eta } |M_n(\alpha ) - M(\alpha )| \\&\quad = \sup _{|\alpha | \le \eta } \left| \bigl [ e^{\alpha x} \{ F_n(x) - F(x) \} \bigr ]_{-L}^{L} - \int _{-L}^{L} \{ F_n(x) - F(x) \} \, {\text{d}}e^{\alpha x} \right| \\&\quad \le 4e^{\eta L} \sup _{|x| \le L} |F_n(x) - F(x)|, \quad \text{ a.s. } \end{aligned}$$

(ii) $\left| \frac{\partial }{\partial \alpha } e^{\alpha x} \right|$ is F-integrable, as for $|\alpha | \le \eta$ we have $\left| \frac{\partial }{\partial \alpha } e^{\alpha x} \right| \le |x| e ^{\eta |x|}$ and

$$\int _{-L}^{L} |x| e ^{\eta |x|} \, {\text{d}}F(x) \le L e^{\eta L}.$$

Therefore, we can interchange differentiation and integration:

$$M'(\alpha ) = \frac{\partial }{\partial \alpha } \int _{-L}^{L} e^{\alpha x} \, {\text{d}}F(x) = \int _{-L}^{L} x e^{\alpha x} \, {\text{d}}F(x).$$

Hence,

$$\begin{aligned}&\sup _{|\alpha | \le \eta } |M'_n(\alpha ) - M'(\alpha )| \\&\quad = \sup _{|\alpha | \le \eta } \left| \bigl [ x e^{\alpha x} \{ F_n(x) - F(x) \} \bigr ]_{-L}^{L} - \int _{-L}^{L} \{ F_n(x) - F(x) \} \, {\text{d}} \bigl ( x e^{\alpha x} \bigr ) \right| \\&\quad \le 4L e^{\eta L} \sup _{|x| \le L} |F_n(x) - F(x)|, \end{aligned}$$

with probability one. $\square$
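The bounds of Lemma 8.4 can be checked on simulated data. The sketch below is an illustration under assumptions that are not in the text: $F$ is taken to be the uniform distribution on $[-L, L]$, for which $M(\alpha ) = \sinh (\alpha L)/(\alpha L)$, the suprema over $\alpha$ are approximated on a finite grid, and all names and default values are hypothetical. The quantity $\sup _{|x| \le L} |F_n(x) - F(x)|$ is computed exactly as the Kolmogorov–Smirnov statistic.

```python
import math
import random


def ks_distance(xs, L):
    """sup_x |F_n(x) - F(x)| for F uniform on [-L, L]."""
    xs = sorted(xs)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        F = (x + L) / (2.0 * L)
        d = max(d, i / n - F, F - (i - 1) / n)
    return d


def M(a, L):
    """Moment generating function of the uniform distribution on [-L, L]."""
    return 1.0 if abs(a) < 1e-12 else math.sinh(a * L) / (a * L)


def M_prime(a, L):
    """Derivative of M with respect to alpha; equals 0 at alpha = 0."""
    if abs(a) < 1e-12:
        return 0.0
    return (a * L * math.cosh(a * L) - math.sinh(a * L)) / (a * a * L)


def lemma_8_4_sides(n=500, L=1.0, eta=1.0, seed=0, points=201):
    """Return (lhs_i, rhs_i, lhs_ii, rhs_ii) for one simulated sample."""
    rng = random.Random(seed)
    xs = [rng.uniform(-L, L) for _ in range(n)]
    d = ks_distance(xs, L)
    grid = [-eta + 2.0 * eta * i / (points - 1) for i in range(points)]
    lhs_i = max(abs(sum(math.exp(a * x) for x in xs) / n - M(a, L)) for a in grid)
    lhs_ii = max(
        abs(sum(x * math.exp(a * x) for x in xs) / n - M_prime(a, L)) for a in grid
    )
    scale = 4.0 * math.exp(eta * L) * d
    return lhs_i, scale, lhs_ii, L * scale


if __name__ == "__main__":
    print(lemma_8_4_sides())
```

In practice the left-hand sides tend to be much smaller than the right-hand sides, since the constant 4 in the lemma is not meant to be sharp.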

Lemma 8.5

For some $\eta > 0$, we have $\inf _{|\alpha | \le \eta } M(\alpha ) > 0.$ Furthermore, there exists $C > 0$ such that $\inf _{|\alpha | \le \eta } M_n(\alpha ) \ge C$ for sufficiently large n with probability one.

Proof

Since $M$ is continuous with $M(0) = 1$, for any sufficiently small $\varepsilon > 0$ there exists an $\eta > 0$ such that $M(\alpha ) > 1 - \varepsilon$ whenever $|\alpha | \le \eta$. Thus, $\inf _{|\alpha | \le \eta } M(\alpha ) \ge 1 - \varepsilon > 0.$ By Lemma 8.4 (i) and the Glivenko–Cantelli theorem, $\sup _{|\alpha | \le \eta } |M_n(\alpha ) - M(\alpha )| \rightarrow 0$ with probability one, so for any $\varepsilon ' > 0$ there exists an $N_0 \ge 1$ such that if $n \ge N_0$, then $M_n(\alpha ) \ge M(\alpha ) - \varepsilon '$ for all $|\alpha | \le \eta$. Hence, if $\varepsilon ' < 1 - \varepsilon$ and $C :=1 - \varepsilon - \varepsilon '$, then

$$\begin{aligned} \inf _{|\alpha | \le \eta } M_n(\alpha ) \ge \inf _{|\alpha | \le \eta } M(\alpha ) - \varepsilon ' \ge 1 - \varepsilon - \varepsilon ' = C > 0, \end{aligned}$$

for sufficiently large n with probability one. $\square$

About this article

Cite this article

Takeuchi, H. Probability distribution as a path and its action integral. Jpn J Stat Data Sci 3, 485–511 (2020). https://doi.org/10.1007/s42081-019-00062-y

Received: 11 June 2019
Accepted: 04 November 2019
Published: 02 December 2019
Issue Date: December 2020
DOI: https://doi.org/10.1007/s42081-019-00062-y
