Abstract
Wasserstein geometry and information geometry are two important structures that can be introduced on a manifold of probability distributions. Wasserstein geometry is defined through the transportation cost between two distributions, so it reflects the metric of the base manifold on which the distributions are defined. Information geometry is defined to be invariant under reversible transformations of the base space. Both have their own merits for applications. In this study, we analyze statistical inference based on Wasserstein geometry when the base space is one-dimensional. Using the location-scale model, we derive the W-estimator, which explicitly minimizes the transportation cost from the empirical distribution to a statistical model, and study its asymptotic behavior. We show that the W-estimator is consistent and give its asymptotic distribution explicitly by using the functional delta method. The W-estimator is Fisher efficient in the Gaussian case.
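To make the idea of the W-estimator concrete, the following is a minimal sketch for the Gaussian location-scale model. It assumes the quadratic transportation cost (the 2-Wasserstein distance), which the abstract does not state. In one dimension that distance is the L2 distance between quantile functions, so minimizing it over the location and scale reduces to a least-squares fit of the order statistics to the standard normal quantile function. The function name `w_estimate_gaussian` and the simulated data are hypothetical illustrations, not taken from the article.

```python
import random
from statistics import NormalDist, fmean


def w_estimate_gaussian(sample):
    """Location and scale minimising the squared 2-Wasserstein distance
    between the empirical distribution of `sample` and N(mu, sigma^2).

    Assumption: quadratic transport cost. The base density is the standard
    normal (mean 0, variance 1), so the minimiser has a closed form.
    """
    std = NormalDist()
    n = len(sample)
    xs = sorted(sample)  # order statistics x_(1) <= ... <= x_(n)

    # phi(z_i) with z_i = Phi^{-1}(i/n); phi tends to 0 at i = 0 and i = n.
    dens = [0.0] + [std.pdf(std.inv_cdf(i / n)) for i in range(1, n)] + [0.0]

    # k_i = integral of Phi^{-1}(u) over ((i-1)/n, i/n] = phi(z_{i-1}) - phi(z_i)
    weights = [dens[i - 1] - dens[i] for i in range(1, n + 1)]

    mu_hat = fmean(xs)  # location estimate: the sample mean
    sigma_hat = sum(k * x for k, x in zip(weights, xs))  # weighted order statistics
    return mu_hat, sigma_hat


# Usage on simulated data with true location 2 and scale 3.
rng = random.Random(0)
data = [rng.gauss(2.0, 3.0) for _ in range(5000)]
mu_hat, sigma_hat = w_estimate_gaussian(data)
print(mu_hat, sigma_hat)
```

The weights sum to zero, so the scale estimate is unaffected by shifting the data, and the location estimate coincides with the sample mean because the standard normal quantile function integrates to zero over (0, 1).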
Acknowledgements
We thank the associate editor and referees for helpful comments. We thank Emi Namioka for drawing the figures. Takeru Matsuda was supported by JSPS KAKENHI Grant Number 19K20220.
About this article
Cite this article
Amari, Si., Matsuda, T. Wasserstein statistics in one-dimensional location scale models. Ann Inst Stat Math 74, 33–47 (2022). https://doi.org/10.1007/s10463-021-00788-1
DOI: https://doi.org/10.1007/s10463-021-00788-1