Procrustes Metrics on Covariance Operators and Optimal Transportation of Gaussian Processes

Sankhya A

Abstract

Covariance operators are fundamental in functional data analysis, providing the canonical means to analyse functional variation via the celebrated Karhunen–Loève expansion. These operators may themselves be subject to variation, for instance in contexts where multiple functional populations are to be compared. Statistical techniques to analyse such variation are intimately linked with the choice of metric on covariance operators, and the intrinsic infinite-dimensionality of these operators. In this paper, we describe the manifold-like geometry of the space of trace-class infinite-dimensional covariance operators and associated key statistical properties, under the recently proposed infinite-dimensional version of the Procrustes metric (Pigoli et al. Biometrika 101, 409–422, 2014). We identify this space with that of centred Gaussian processes equipped with the Wasserstein metric of optimal transportation. The identification allows us to provide a detailed description of those aspects of this manifold-like geometry that are important in terms of statistical inference; to establish key properties of the Fréchet mean of a random sample of covariances; and to define generative models that are canonical for such metrics and link with the problem of registration of warped functional data.
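
The identification of the Procrustes metric with the Wasserstein metric can be checked numerically in finite dimensions. The following is a minimal sketch, not code from the paper: the function names are hypothetical, covariance matrices stand in for trace-class operators, and the closed form used for the 2-Wasserstein distance between centred Gaussians is the standard one, tr A + tr B - 2 tr (A^{1/2} B A^{1/2})^{1/2}.

```python
import numpy as np


def psd_sqrt(S):
    """Symmetric square root of a positive semi-definite matrix."""
    w, V = np.linalg.eigh((S + S.T) / 2.0)  # symmetrise against round-off
    w = np.clip(w, 0.0, None)               # drop tiny negative eigenvalues
    return (V * np.sqrt(w)) @ V.T


def procrustes_distance(A, B):
    """min over orthogonal R of ||A^{1/2} - B^{1/2} R||_F (hypothetical helper)."""
    rA, rB = psd_sqrt(A), psd_sqrt(B)
    # Orthogonal Procrustes problem: the optimal R comes from the SVD
    # of B^{1/2} A^{1/2}.
    U, _, Vt = np.linalg.svd(rB @ rA)
    R = U @ Vt
    return np.linalg.norm(rA - rB @ R, "fro")


def gaussian_wasserstein_distance(A, B):
    """Closed-form 2-Wasserstein distance between N(0, A) and N(0, B)."""
    rA = psd_sqrt(A)
    cross = psd_sqrt(rA @ B @ rA)
    d2 = np.trace(A) + np.trace(B) - 2.0 * np.trace(cross)
    return np.sqrt(max(d2, 0.0))  # clamp round-off below zero


# Two random covariance matrices (illustrative data only).
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 5))
Y = rng.standard_normal((5, 5))
A, B = X @ X.T, Y @ Y.T

d_proc = procrustes_distance(A, B)
d_wass = gaussian_wasserstein_distance(A, B)
print(d_proc, d_wass)  # the two values should agree up to round-off
```

The Procrustes side is computed by explicit alignment of the square roots, while the Wasserstein side uses only the closed-form expression, so agreement of the two numbers is a genuine check rather than the same formula evaluated twice.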

References

  • Agueh, M. and Carlier, G. (2011). Barycenters in the Wasserstein space. SIAM J. Math. Anal. 43, 904–924.

  • Alexander, D.C. (2005). Multiple-fiber reconstruction algorithms for diffusion MRI. Ann. N. Y. Acad. Sci. 1064, 113–133.

  • Álvarez-Esteban, P., Del Barrio, E., Cuesta-Albertos, J., Matrán, C. et al. (2011). Uniqueness and approximate computation of optimal incomplete transportation plans. Annales de l’Institut Henri Poincaré, Probabilités et Statistiques 47, 358–375.

  • Álvarez-Esteban, P.C., del Barrio, E., Cuesta-Albertos, J. and Matrán, C. (2016). A fixed-point approach to barycenters in Wasserstein space. J. Math. Anal. Appl. 441, 744–762.

  • Ambrosio, L. and Gigli, N. (2013). A User’s Guide to Optimal Transport. In Modelling and Optimisation of Flows on Networks. Springer, pp. 1–155.

  • Ambrosio, L., Gigli, N. and Savaré, G. (2008). Gradient Flows: in Metric Spaces and in the Space of Probability Measures. Springer Science & Business Media, Berlin.

  • Benko, M., Härdle, W. and Kneip, A. (2009). Common functional principal components. Ann. Statist. 37, 1–34.

  • Bhatia, R., Jain, T. and Lim, Y. (2018). On the Bures-Wasserstein distance between positive definite matrices. Expo. Math. https://doi.org/10.1016/j.exmath.2018.01.002.

  • Bhattacharya, R. and Patrangenaru, V. (2003). Large sample theory of intrinsic and extrinsic sample means on manifolds: I. Ann. Statist. 31, 1–29.

  • Bhattacharya, R. and Patrangenaru, V. (2005). Large sample theory of intrinsic and extrinsic sample means on manifolds: II. Ann. Statist. 33, 1225–1259.

  • Bigot, J. and Klein, T. (2012). Characterization of barycenters in the Wasserstein space by averaging optimal transport maps. arXiv:1212.2562.

  • Bogachev, V. I. (1998). Gaussian Measures, vol. 62. American Mathematical Society, Providence.

  • Brenier, Y. (1991). Polar factorization and monotone rearrangement of vector-valued functions. Comm. Pure Appl. Math. 44, 375–417.

  • Coffey, N., Harrison, A., Donoghue, O. and Hayes, K. (2011). Common functional principal components analysis: a new approach to analyzing human movement data. Hum. Mov. Sci. 30, 1144–1166.

  • Cuesta-Albertos, J., Matrán-Bea, C. and Tuero-Diaz, A. (1996). On lower bounds for the L²-Wasserstein metric in a Hilbert space. J. Theoret. Probab. 9, 263–283.

  • Cuesta-Albertos, J. A. and Matrán, C. (1989). Notes on the Wasserstein metric in Hilbert spaces. Ann. Probab. 17, 1264–1276.

  • Descary, M.-H. and Panaretos, V.M. (2016). Functional data analysis by matrix completion. Ann. Stat. arXiv:1609.00834.

  • Dryden, I. and Mardia, K. (1998). Statistical Analysis of Shape. Wiley, New York.

  • Dryden, I.L., Koloydenko, A. and Zhou, D. (2009). Non-Euclidean statistics for covariance matrices, with applications to diffusion tensor imaging. Ann. Appl. Stat. 3, 1102–1123.

  • Durrett, R. (2010). Probability: Theory and Examples. Cambridge University Press, Cambridge.

  • Fletcher, P.T., Lu, C., Pizer, S.M. and Joshi, S. (2004). Principal geodesic analysis for the study of nonlinear statistics of shape. IEEE Trans. Med. Imaging 23, 995–1005.

  • Fréchet, M. (1948). Les éléments aléatoires de nature quelconque dans un espace distancié. Ann. Inst. H. Poincaré, 10, 215–310.

  • Fremdt, S., Steinebach, J.G., Horváth, L. and Kokoszka, P. (2013). Testing the equality of covariance operators in functional samples. Scand. J. Stat. 40, 138–152.

  • Gabrys, R., Horváth, L. and Kokoszka, P. (2010). Tests for error correlation in the functional linear model. J. Amer. Statist. Assoc. 105, 1113–1125.

  • Gangbo, W. and Świȩch, A. (1998). Optimal maps for the multidimensional Monge–Kantorovich problem. Comm. Pure Appl. Math. 51, 23–45.

  • Gower, J.C. (1975). Generalized Procrustes analysis. Psychometrika 40, 33–51.

  • Horváth, L., Hušková, M. and Rice, G. (2013). Test of independence for functional data. J. Multivariate Anal. 117, 100–119.

  • Horváth, L. and Kokoszka, P. (2012). Inference for Functional Data with Applications, vol. 200. Springer Science & Business Media, Berlin.

  • Hsing, T. and Eubank, R. (2015). Theoretical Foundations of Functional Data Analysis, with an Introduction to Linear Operators. Wiley, New York.

  • Huckemann, S., Hotz, T. and Munk, A. (2010). Intrinsic shape analysis: geodesic PCA for Riemannian manifolds modulo isometric Lie group actions. Statist. Sinica 20, 1–58.

  • Jarušková, D. (2013). Testing for a change in covariance operator. J. Statist. Plann. Inference 143, 1500–1511.

  • Jolliffe, I.T. (2002). Principal component analysis. Springer, New York.

  • Karcher, H. (1977). Riemannian center of mass and mollifier smoothing. Comm. Pure Appl. Math. 30, 509–541.

  • Knott, M. and Smith, C.S. (1984). On the optimal mapping of distributions. J. Optim. Theory Appl. 43, 39–49.

  • Kraus, D. (2015). Components and completion of partially observed functional data. J. R. Stat. Soc. Ser. B Stat. Methodol. 77, 777–801.

  • Kraus, D. and Panaretos, V.M. (2012). Dispersion operators and resistant second-order functional data analysis. Biometrika 99, 813–832.

  • Le Gouic, T. and Loubes, J.-M. (2017). Existence and consistency of Wasserstein barycenters. Probab. Theory Relat. Fields 168, 1–17.

  • McCann, R.J. (1997). A convexity principle for interacting gases. Adv. Math. 128, 153–179.

  • Olkin, I. and Pukelsheim, F. (1982). The distance between two random vectors with given dispersion matrices. Linear Algebra Appl. 48, 257–263.

  • Panaretos, V.M., Kraus, D. and Maddocks, J.H. (2010). Second-order comparison of Gaussian random functions and the geometry of DNA minicircles. J. Amer. Statist. Assoc. 105, 670–682.

  • Panaretos, V.M. and Tavakoli, S. (2013). Cramér–Karhunen–Loève representation and harmonic principal component analysis of functional time series. Stochastic Process. Appl. 123, 2779–2807.

  • Panaretos, V.M. and Zemel, Y. (2016). Amplitude and phase variation of point processes. Ann. Statist. 44, 771–812.

  • Panaretos, V.M. and Zemel, Y. (2018). Introduction to statistics in the Wasserstein space. Springer Briefs in Probability and Mathematical Statistics. To appear.

  • Paparoditis, E. and Sapatinas, T. (2014). Bootstrap-based testing for functional data. arXiv:1409.4317.

  • Pigoli, D., Aston, J.A., Dryden, I.L. and Secchi, P. (2014). Distances and inference for covariance operators. Biometrika 101, 409–422.

  • Ramsay, J. and Silverman, B. (2005). Functional Data Analysis. Springer Series in Statistics. Springer, New York.

  • Rippl, T., Munk, A. and Sturm, A. (2016). Limit laws of the empirical Wasserstein distance: Gaussian distributions. J. Multivar. Anal. 151, 90–109.

  • Rüschendorf, L. and Rachev, S.T. (1990). A characterization of random variables with minimum L²-distance. J. Multivar. Anal. 32, 48–54.

  • Rüschendorf, L. and Uckelmann, L. (2002). On the n-coupling problem. J. Multivariate Anal. 81, 242–258.

  • Schwartzman, A. (2006). Random Ellipsoids and False Discovery Rates: Statistics for Diffusion Tensor Imaging Data. PhD thesis, Stanford University.

  • Schwartzman, A., Dougherty, R.F. and Taylor, J.E. (2008). False discovery rate analysis of brain diffusion direction maps. Ann. Appl. Stat. 2, 153–175.

  • Stein, E.M. and Shakarchi, R. (2009). Real Analysis: Measure Theory, Integration, and Hilbert Spaces. Princeton University Press, Princeton.

  • Takatsu, A. (2011). Wasserstein geometry of Gaussian measures. Osaka J. Math. 48, 1005–1026.

  • Tavakoli, S. and Panaretos, V.M. (2016). Detecting and localizing differences in functional time series dynamics: a case study in molecular biophysics. J. Amer. Statist. Assoc. 111, 1020–1035.

  • van der Vaart, A.W. and Wellner, J.A. (1996). Weak Convergence and Empirical Processes. Springer, Berlin.

  • Villani, C. (2003). Topics in Optimal Transportation, vol. 58. American Mathematical Society, Providence.

  • von Renesse, M.-K. and Sturm, K.-T. (2009). Entropic measure and Wasserstein diffusion. Ann. Probab. 37, 1114–1191.

  • Wang, J.-L., Chiou, J.-M. and Müller, H.-G. (2016). Functional data analysis. Annual Review of Statistics and Its Application 3, 257–295.

  • Yao, F., Müller, H.-G. and Wang, J.-L. (2005a). Functional data analysis for sparse longitudinal data. J. Amer. Statist. Assoc. 100, 577–590.

  • Yao, F., Müller, H.-G., Wang, J.-L. et al. (2005b). Functional linear regression analysis for longitudinal data. Ann. Statist. 33, 2873–2903.

  • Zemel, Y. (2017). Fréchet Means in Wasserstein Space: Theory and Algorithms. PhD thesis, École Polytechnique Fédérale de Lausanne.

  • Zemel, Y. and Panaretos, V.M. (2017). Fréchet means and Procrustes analysis in Wasserstein space. Bernoulli (to appear), available on arXiv:1701.06876.

  • Zhang, J. (2013). Analysis of Variance for Functional Data. Monographs on statistics and applied probability. Chapman & Hall, London.

  • Ziezold, H. (1977). On Expected Figures and a Strong Law of Large Numbers for Random Elements in Quasi-Metric Spaces. In Transactions of the Seventh Prague Conference on Information Theory, Statistical Decision Functions, Random Processes and of the 1974 European Meeting of Statisticians. Springer, pp. 591–602.

Acknowledgements

We wish to warmly thank a reviewer for providing constructive and insightful comments that led to genuine improvements in our presentation. This research is supported in part by a Swiss National Science Foundation grant to V. M. Panaretos.

Author information

Corresponding author

Correspondence to Victor M. Panaretos.

About this article

Cite this article

Masarotto, V., Panaretos, V.M. & Zemel, Y. Procrustes Metrics on Covariance Operators and Optimal Transportation of Gaussian Processes. Sankhya A 81, 172–213 (2019). https://doi.org/10.1007/s13171-018-0130-1

  • DOI: https://doi.org/10.1007/s13171-018-0130-1
