Wasserstein Riemannian geometry of Gaussian densities

Abstract

The Wasserstein distance on multivariate non-degenerate Gaussian densities is a Riemannian distance. After reviewing the properties of the distance and the metric geodesic, we present an explicit form of the Riemannian metric on positive-definite matrices and compute its tensor form with respect to the trace inner product. The tensor is a matrix that solves a Lyapunov equation. We compute explicit formulas for the Riemannian exponential, the normal coordinate charts, and the Riemannian gradient. Finally, the Levi-Civita covariant derivative is computed in matrix form, together with the differential equation for the parallel transport. Although all computations are given in matrix form, we also discuss the use of a special moving frame.
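
As a small illustration of the distance discussed in the abstract, the following sketch evaluates the well-known closed form of the L2-Wasserstein distance between two Gaussian densities, W^2 = |mu1 - mu2|^2 + tr(S1 + S2 - 2 (S1^{1/2} S2 S1^{1/2})^{1/2}). The function names `sqrtm_spd` and `wasserstein2_gaussian` are hypothetical and chosen only for this example, and the use of NumPy is an assumption; the abstract itself does not prescribe any implementation.

```python
import numpy as np


def sqrtm_spd(a):
    """Symmetric square root of a symmetric positive semi-definite matrix.

    Uses the spectral decomposition a = V diag(w) V^T and returns
    V diag(sqrt(w)) V^T. Tiny negative eigenvalues caused by rounding
    are clipped to zero.
    """
    w, v = np.linalg.eigh(a)
    w = np.clip(w, 0.0, None)
    return (v * np.sqrt(w)) @ v.T


def wasserstein2_gaussian(mu1, sigma1, mu2, sigma2):
    """L2-Wasserstein distance between N(mu1, sigma1) and N(mu2, sigma2).

    Hypothetical helper for illustration: it evaluates the closed form
    |mu1 - mu2|^2 + tr(sigma1 + sigma2 - 2 (sigma1^{1/2} sigma2 sigma1^{1/2})^{1/2})
    and returns its square root.
    """
    root1 = sqrtm_spd(sigma1)
    cross = sqrtm_spd(root1 @ sigma2 @ root1)
    squared = np.sum((mu1 - mu2) ** 2) + np.trace(sigma1 + sigma2 - 2.0 * cross)
    # Guard against a slightly negative value produced by rounding.
    return float(np.sqrt(max(squared, 0.0)))


if __name__ == "__main__":
    zero = np.zeros(2)
    s1 = np.diag([1.0, 4.0])
    s2 = np.diag([9.0, 16.0])
    # Commuting covariances: the squared distance reduces to
    # (1 - 3)^2 + (2 - 4)^2 = 8, so the result is about 2.828.
    print(wasserstein2_gaussian(zero, s1, zero, s2))
```

For diagonal (commuting) covariances the formula reduces to the Euclidean distance between the vectors of standard deviations, which gives a quick sanity check of the sketch.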

Acknowledgements

The authors wish to thank two anonymous referees for helpful comments. G. Pistone acknowledges the support of de Castro Statistics and Collegio Carlo Alberto. He is a member of GNAMPA-INdAM.

Author information

Corresponding author

Correspondence to Giovanni Pistone.

About this article

Cite this article

Malagò, L., Montrucchio, L. & Pistone, G. Wasserstein Riemannian geometry of Gaussian densities. Info. Geo. 1, 137–179 (2018). https://doi.org/10.1007/s41884-018-0014-4

Keywords

  • Information geometry
  • Gaussian distribution
  • Wasserstein distance
  • Riemannian metrics
  • Natural gradient
  • Riemannian exponential
  • Normal coordinates
  • Levi-Civita covariant derivative
  • Optimization on positive-definite symmetric matrices

Mathematics Subject Classification

  • 15B48
  • 53C23
  • 53C25
  • 60D05