Information geometry

  • Special Feature: The Takagi Lectures
Japanese Journal of Mathematics

Abstract

Information geometry has emerged from the study of the invariant structure in families of probability distributions. This invariance uniquely determines a second-order symmetric tensor g and third-order symmetric tensor T in a manifold of probability distributions. A pair of these tensors (g, T) defines a Riemannian metric and a pair of affine connections which together preserve the metric. Information geometry involves studying a Riemannian manifold having a pair of dual affine connections. Such a structure also arises from an asymmetric divergence function and affine differential geometry. A dually flat Riemannian manifold is particularly useful for various applications, because a generalized Pythagorean theorem and projection theorem hold. The Wasserstein distance gives another important geometry on probability distributions, which is non-invariant but responsible for the metric properties of a sample space. I attempt to construct information geometry of the entropy-regularized Wasserstein distance.
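The generalized Pythagorean theorem mentioned above can be checked numerically in a simple discrete case. The following is a minimal sketch, not taken from the article: it assumes the Kullback–Leibler divergence as the divergence on the probability simplex and uses the family of product (independent) distributions as the flat submodel. All names (`kl`, `normalize`, `outer`, the table size 3 x 4) are illustrative. With p an arbitrary joint distribution, q the product of its marginals (the projection of p onto the independence model), and r any other product distribution, the relation KL(p‖r) = KL(p‖q) + KL(q‖r) should hold up to rounding.

```python
import math
import random


def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) for flat lists of probabilities."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def normalize(weights):
    """Scale positive weights so that they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]


def outer(a, b):
    """Product distribution of a and b, flattened in row-major order."""
    return [ai * bj for ai in a for bj in b]


random.seed(0)
n_rows, n_cols = 3, 4  # illustrative table size (an assumption)

# p: an arbitrary strictly positive joint distribution on a 3 x 4 table.
p = normalize([random.random() + 0.1 for _ in range(n_rows * n_cols)])

# Marginals of p.
p_row = [sum(p[i * n_cols + j] for j in range(n_cols)) for i in range(n_rows)]
p_col = [sum(p[i * n_cols + j] for i in range(n_rows)) for j in range(n_cols)]

# q: product of the marginals of p, i.e. the minimizer of KL(p || .)
# over all product distributions.
q = outer(p_row, p_col)

# r: any other product distribution.
a = normalize([random.random() + 0.1 for _ in range(n_rows)])
b = normalize([random.random() + 0.1 for _ in range(n_cols)])
r = outer(a, b)

lhs = kl(p, r)
rhs = kl(p, q) + kl(q, r)
print(f"KL(p||r)            = {lhs:.12f}")
print(f"KL(p||q) + KL(q||r) = {rhs:.12f}")
```

The two printed values agree to floating-point precision because the product distributions form an exponential family, so the decomposition is exact rather than approximate. For a distribution r outside that family the equality generally fails.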

Author information

Corresponding author

Correspondence to Shun-ichi Amari.

Additional information

Communicated by: Toshiyuki Kobayashi

This article is based on the 23rd Takagi Lectures that the author delivered at Research Institute for Mathematical Sciences, Kyoto University on June 8, 2019.

About this article

Cite this article

Amari, S. Information geometry. Jpn. J. Math. 16, 1–48 (2021). https://doi.org/10.1007/s11537-020-1920-5

  • DOI: https://doi.org/10.1007/s11537-020-1920-5
