Sankhya A

pp 1–42

Procrustes Metrics on Covariance Operators and Optimal Transportation of Gaussian Processes

  • Valentina Masarotto
  • Victor M. Panaretos
  • Yoav Zemel


Covariance operators are fundamental in functional data analysis, providing the canonical means to analyse functional variation via the celebrated Karhunen–Loève expansion. These operators may themselves be subject to variation, for instance in contexts where multiple functional populations are to be compared. Statistical techniques to analyse such variation are intimately linked with the choice of metric on covariance operators, and the intrinsic infinite-dimensionality of these operators. In this paper, we describe the manifold-like geometry of the space of trace-class infinite-dimensional covariance operators and associated key statistical properties, under the recently proposed infinite-dimensional version of the Procrustes metric (Pigoli et al. Biometrika 101, 409–422, 2014). We identify this space with that of centred Gaussian processes equipped with the Wasserstein metric of optimal transportation. The identification allows us to provide a detailed description of those aspects of this manifold-like geometry that are important in terms of statistical inference; to establish key properties of the Fréchet mean of a random sample of covariances; and to define generative models that are canonical for such metrics and link with the problem of registration of warped functional data.
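The identification between the Procrustes metric and the Wasserstein metric can be checked numerically in finite dimensions. The following is a minimal sketch, assuming covariance matrices in place of trace-class operators and using the standard closed form for the 2-Wasserstein distance between centred Gaussians; the function names (psd_sqrt, wasserstein_gaussian, procrustes) are illustrative and do not come from the paper.

```python
import numpy as np


def psd_sqrt(a):
    """Symmetric square root of a symmetric positive semi-definite matrix."""
    w, v = np.linalg.eigh(a)
    w = np.clip(w, 0.0, None)  # guard against tiny negative eigenvalues
    return (v * np.sqrt(w)) @ v.T


def wasserstein_gaussian(a, b):
    """2-Wasserstein distance between N(0, a) and N(0, b):
    sqrt(tr a + tr b - 2 tr (a^{1/2} b a^{1/2})^{1/2})."""
    ra = psd_sqrt(a)
    cross = psd_sqrt(ra @ b @ ra)
    d2 = np.trace(a) + np.trace(b) - 2.0 * np.trace(cross)
    return float(np.sqrt(max(d2, 0.0)))


def procrustes(a, b):
    """Procrustes distance: min over orthogonal U of ||a^{1/2} - b^{1/2} U||_F."""
    ra, rb = psd_sqrt(a), psd_sqrt(b)
    # tr(ra @ rb @ U) is maximised by U = Q P^T, where ra @ rb = P S Q^T (SVD).
    p, _, qt = np.linalg.svd(ra @ rb)
    u = qt.T @ p.T
    return float(np.linalg.norm(ra - rb @ u, "fro"))


# Two random covariance matrices (illustrative data only).
rng = np.random.default_rng(0)
x = rng.standard_normal((5, 5))
y = rng.standard_normal((5, 5))
cov_a = x @ x.T
cov_b = y @ y.T

print(procrustes(cov_a, cov_b), wasserstein_gaussian(cov_a, cov_b))
```

The two printed values should agree up to floating-point error, since minimising over orthogonal U in the Procrustes problem yields the nuclear norm of a^{1/2} b^{1/2}, which equals the trace term in the Gaussian Wasserstein formula.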

Keywords and phrases.

Functional data analysis, Fréchet mean, Manifold statistics, Optimal coupling, Tangent space PCA, Trace-class operator.

AMS (2000) subject classification.

Primary: 60G15 (Gaussian processes), 60D05 (Geometric probability and stochastic geometry). Secondary: 60H25 (Random operators and equations), 62M99 (None of the above but in this section).





We wish to warmly thank a reviewer for providing constructive and insightful comments that led to genuine improvements in our presentation. This research is supported in part by a Swiss National Science Foundation grant to V. M. Panaretos.


  1. Agueh, M. and Carlier, G. (2011). Barycenters in the Wasserstein space. SIAM J. Math. Anal. 43, 904–924.
  2. Alexander, D.C. (2005). Multiple-fiber reconstruction algorithms for diffusion MRI. Ann. N. Y. Acad. Sci. 1064, 113–133.
  3. Álvarez-Esteban, P., Del Barrio, E., Cuesta-Albertos, J., Matrán, C. et al. (2011). Uniqueness and approximate computation of optimal incomplete transportation plans. Annales de l’Institut Henri Poincaré, Probabilités et Statistiques 47, 358–375.
  4. Álvarez-Esteban, P.C., del Barrio, E., Cuesta-Albertos, J. and Matrán, C. (2016). A fixed-point approach to barycenters in Wasserstein space. J. Math. Anal. Appl. 441, 744–762.
  5. Ambrosio, L. and Gigli, N. (2013). A User’s Guide to Optimal Transport. In Modelling and Optimisation of Flows on Networks. Springer, pp. 1–155.
  6. Ambrosio, L., Gigli, N. and Savaré, G. (2008). Gradient Flows: in Metric Spaces and in the Space of Probability Measures. Springer Science & Business Media, Berlin.
  7. Benko, M., Härdle, W. and Kneip, A. (2009). Common functional principal components. Ann. Statist. 37, 1–34.
  8. Bhatia, R., Jain, T. and Lim, Y. (2018). On the Bures-Wasserstein distance between positive definite matrices. Expo. Math.
  9. Bhattacharya, R. and Patrangenaru, V. (2003). Large sample theory of intrinsic and extrinsic sample means on manifolds: I. Ann. Statist. 31, 1–29.
  10. Bhattacharya, R. and Patrangenaru, V. (2005). Large sample theory of intrinsic and extrinsic sample means on manifolds: II. Ann. Statist. 33, 1225–1259.
  11. Bigot, J. and Klein, T. (2012). Characterization of barycenters in the Wasserstein space by averaging optimal transport maps. arXiv:1212.2562.
  12. Bogachev, V. I. (1998). Gaussian Measures, vol. 62. American Mathematical Society, Providence.
  13. Brenier, Y. (1991). Polar factorization and monotone rearrangement of vector-valued functions. Comm. Pure Appl. Math. 44, 375–417.
  14. Coffey, N., Harrison, A., Donoghue, O. and Hayes, K. (2011). Common functional principal components analysis: a new approach to analyzing human movement data. Hum. Mov. Sci. 30, 1144–1166.
  15. Cuesta-Albertos, J., Matrán-Bea, C. and Tuero-Diaz, A. (1996). On lower bounds for the L²-Wasserstein metric in a Hilbert space. J. Theoret. Probab. 9, 263–283.
  16. Cuesta-Albertos, J. A. and Matrán, C. (1989). Notes on the Wasserstein metric in Hilbert spaces. Ann. Probab. 17, 1264–1276.
  17. Descary, M.-H. and Panaretos, V.M. (2016). Functional data analysis by matrix completion. Ann. Stat. arXiv:1609.00834.
  18. Dryden, I. and Mardia, K. (1998). Statistical Analysis of Shape. Wiley, New York.
  19. Dryden, I.L., Koloydenko, A. and Zhou, D. (2009). Non-Euclidean statistics for covariance matrices, with applications to diffusion tensor imaging. Ann. Appl. Stat. 3, 1102–1123.
  20. Durrett, R. (2010). Probability: Theory and Examples. Cambridge University Press, Cambridge.
  21. Fletcher, P.T., Lu, C., Pizer, S.M. and Joshi, S. (2004). Principal geodesic analysis for the study of nonlinear statistics of shape. IEEE Trans. Med. Imaging 23, 995–1005.
  22. Fréchet, M. (1948). Les éléments aléatoires de nature quelconque dans un espace distancié. Ann. Inst. H. Poincaré 10, 215–310.
  23. Fremdt, S., Steinebach, J.G., Horváth, L. and Kokoszka, P. (2013). Testing the equality of covariance operators in functional samples. Scand. J. Stat. 40, 138–152.
  24. Gabrys, R., Horváth, L. and Kokoszka, P. (2010). Tests for error correlation in the functional linear model. J. Amer. Statist. Assoc. 105, 1113–1125.
  25. Gangbo, W. and Świȩch, A. (1998). Optimal maps for the multidimensional Monge–Kantorovich problem. Comm. Pure Appl. Math. 51, 23–45.
  26. Gower, J.C. (1975). Generalized Procrustes analysis. Psychometrika 40, 33–51.
  27. Horváth, L., Hušková, M. and Rice, G. (2013). Test of independence for functional data. J. Multivariate Anal. 117, 100–119.
  28. Horváth, L. and Kokoszka, P. (2012). Inference for Functional Data with Applications, vol. 200. Springer Science & Business Media, Berlin.
  29. Hsing, T. and Eubank, R. (2015). Theoretical Foundations of Functional Data Analysis, with an Introduction to Linear Operators. Wiley, New York.
  30. Huckemann, S., Hotz, T. and Munk, A. (2010). Intrinsic shape analysis: geodesic PCA for Riemannian manifolds modulo isometric Lie group actions. Statist. Sinica 20, 1–58.
  31. Jarušková, D. (2013). Testing for a change in covariance operator. J. Statist. Plann. Inference 143, 1500–1511.
  32. Jolliffe, I.T. (2002). Principal component analysis. Springer, New York.
  33. Karcher, H. (1977). Riemannian center of mass and mollifier smoothing. Comm. Pure Appl. Math. 30, 509–541.
  34. Knott, M. and Smith, C.S. (1984). On the optimal mapping of distributions. J. Optim. Theory Appl. 43, 39–49.
  35. Kraus, D. (2015). Components and completion of partially observed functional data. J. R. Stat. Soc. Ser. B Stat Methodol. 77, 777–801.
  36. Kraus, D. and Panaretos, V.M. (2012). Dispersion operators and resistant second-order functional data analysis. Biometrika 99, 813–832.
  37. Le Gouic, T. and Loubes, J.-M. (2017). Existence and consistency of Wasserstein barycenters. Probab. Theory Relat. Fields 168, 1–17.
  38. McCann, R.J. (1997). A convexity principle for interacting gases. Adv. Math. 128, 153–179.
  39. Olkin, I. and Pukelsheim, F. (1982). The distance between two random vectors with given dispersion matrices. Linear Algebra Appl. 48, 257–263.
  40. Panaretos, V.M., Kraus, D. and Maddocks, J.H. (2010). Second-order comparison of Gaussian random functions and the geometry of DNA minicircles. J. Amer. Statist. Assoc. 105, 670–682.
  41. Panaretos, V.M. and Tavakoli, S. (2013). Cramér–Karhunen–Loève representation and harmonic principal component analysis of functional time series. Stochastic Process. Appl. 123, 2779–2807.
  42. Panaretos, V.M. and Zemel, Y. (2016). Amplitude and phase variation of point processes. Ann. Statist. 44, 771–812.
  43. Panaretos, V.M. and Zemel, Y. (2018). Introduction to statistics in the Wasserstein space. Springer Briefs in Probability and Mathematical Statistics. To appear.
  44. Paparoditis, E. and Sapatinas, T. (2014). Bootstrap-based testing for functional data. arXiv:1409.4317.
  45. Pigoli, D., Aston, J.A., Dryden, I.L. and Secchi, P. (2014). Distances and inference for covariance operators. Biometrika 101, 409–422.
  46. Ramsay, J. and Silverman, B. (2005). Functional Data Analysis. Springer Series in Statistics.
  47. Rippl, T., Munk, A. and Sturm, A. (2016). Limit laws of the empirical Wasserstein distance: Gaussian distributions. J. Multivar. Anal. 151, 90–109.
  48. Rüschendorf, L. and Rachev, S.T. (1990). A characterization of random variables with minimum L²-distance. J. Multivar. Anal. 32, 48–54.
  49. Rüschendorf, L. and Uckelmann, L. (2002). On the n-coupling problem. J. Multivariate Anal. 81, 242–258.
  50. Schwartzman, A. (2006). Random Ellipsoids and False Discovery Rates: Statistics for Diffusion Tensor Imaging Data. PhD thesis, Stanford University.
  51. Schwartzman, A., Dougherty, R.F. and Taylor, J.E. (2008). False discovery rate analysis of brain diffusion direction maps. Ann. Appl. Stat. 2, 153–175.
  52. Stein, E.M. and Shakarchi, R. (2009). Real Analysis: Measure Theory, Integration, and Hilbert Spaces. Princeton University Press, Princeton.
  53. Takatsu, A. (2011). Wasserstein geometry of Gaussian measures. Osaka J. Math. 48, 1005–1026.
  54. Tavakoli, S. and Panaretos, V.M. (2016). Detecting and localizing differences in functional time series dynamics: a case study in molecular biophysics. J. Amer. Statist. Assoc. 111, 1020–1035.
  55. van der Vaart, A.W. and Wellner, J.A. (1996). Weak Convergence and Empirical Processes. Springer, Berlin.
  56. Villani, C. (2003). Topics in Optimal Transportation, vol. 58. American Mathematical Society, Providence.
  57. von Renesse, M.-K. and Sturm, K.-T. (2009). Entropic measure and Wasserstein diffusion. Ann. Probab. 37, 1114–1191.
  58. Wang, J.-L., Chiou, J.-M. and Müller, H.-G. (2016). Functional data analysis. Annual Review of Statistics and Its Application 3, 257–295.
  59. Yao, F., Müller, H.-G. and Wang, J.-L. (2005a). Functional data analysis for sparse longitudinal data. J. Amer. Statist. Assoc. 100, 577–590.
  60. Yao, F., Müller, H.-G., Wang, J.-L. et al. (2005b). Functional linear regression analysis for longitudinal data. Ann. Statist. 33, 2873–2903.
  61. Zemel, Y. (2017). Fréchet Means in Wasserstein Space: Theory and Algorithms. PhD thesis, École Polytechnique Fédérale de Lausanne.
  62. Zemel, Y. and Panaretos, V.M. (2017). Fréchet means and Procrustes analysis in Wasserstein space. Bernoulli (to appear), available on arXiv:1701.06876.
  63. Zhang, J. (2013). Analysis of Variance for Functional Data. Monographs on statistics and applied probability. Chapman & Hall, London.
  64. Ziezold, H. (1977). On Expected Figures and a Strong Law of Large Numbers for Random Elements in Quasi-Metric Spaces. In Transactions of the Seventh Prague Conference on Information Theory, Statistical Decision Functions, Random Processes and of the 1974 European Meeting of Statisticians. Springer, pp. 591–602.

Copyright information

© Indian Statistical Institute 2018

Authors and Affiliations

  • Valentina Masarotto (1)
  • Victor M. Panaretos (1; corresponding author)
  • Yoav Zemel (1)

  1. Institut de Mathématiques, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
