The Statistics of Circular Optimal Transport

Directional Statistics for Innovative Applications

Abstract

Empirical optimal transport (OT) plans and distances provide effective tools to compare and statistically match probability measures defined on a given ground space. Fundamental to this are distributional limit laws, and we derive a central limit theorem for the empirical OT distance of circular data. Our limit results require only mild assumptions in general and include prominent examples such as the von Mises or wrapped Cauchy family. Most notably, no assumptions are required when data are sampled from the probability measure to be compared with, which is in strict contrast to the real line. A bootstrap principle follows immediately, as our proof relies on Hadamard differentiability of the OT functional. This paves the way for a variety of statistical inference tasks and is exemplified for asymptotic OT-based goodness-of-fit testing for circular distributions. We discuss the numerical implementation and consistency of this test and investigate its statistical power. For testing uniformity, it turns out that this approach performs particularly well for unimodal alternatives and is almost as powerful as Rayleigh’s test, the most powerful invariant test for von Mises alternatives. For regimes with many modes, the circular OT test is less powerful, which is explained by the shape of the corresponding transport plan.
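To make the empirical OT distance on the circle concrete, the following is a minimal sketch and not the authors' implementation. It assumes the cost is the arc-length distance (the case p = 1) and compares two empirical samples of angles in radians. It relies on the known representation of the circular 1-Wasserstein distance as the minimum over a shift α of the integral of |F(t) − G(t) − α| over the circle, where F and G are the distribution functions of the two samples after cutting the circle at angle 0 and the optimal α is a median of F − G. The function name `circular_w1` is hypothetical.

```python
import numpy as np


def circular_w1(x, y):
    """Circular 1-Wasserstein distance between two samples of angles (radians).

    Illustrative sketch for the arc-length cost only; the name and interface
    are assumptions, not taken from the chapter.
    """
    two_pi = 2.0 * np.pi
    x = np.sort(np.mod(np.asarray(x, dtype=float), two_pi))
    y = np.sort(np.mod(np.asarray(y, dtype=float), two_pi))

    # Cut the circle at angle 0 and use all sample points as breakpoints.
    grid = np.concatenate(([0.0], np.sort(np.concatenate((x, y))), [two_pi]))
    lengths = np.diff(grid)   # length of each interval between breakpoints
    left = grid[:-1]          # left endpoint of each interval

    # Both empirical CDFs are constant on each interval, so evaluating them
    # at the left endpoint gives their value on the whole interval.
    F = np.searchsorted(x, left, side="right") / x.size
    G = np.searchsorted(y, left, side="right") / y.size
    d = F - G

    # The optimal shift alpha is a median of d weighted by interval length.
    order = np.argsort(d)
    cum = np.cumsum(lengths[order])
    alpha = d[order][np.searchsorted(cum, 0.5 * cum[-1])]

    return float(np.sum(lengths * np.abs(d - alpha)))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    sample_a = rng.vonmises(mu=0.0, kappa=2.0, size=200)    # unimodal sample
    sample_b = rng.uniform(0.0, 2.0 * np.pi, size=200)      # uniform sample
    print(circular_w1(sample_a, sample_b))
```

The shift α is what distinguishes the circle from the real line: it accounts for the freedom to choose where the circle is cut, so two point masses on either side of angle 0 are matched along the short arc rather than the long one.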

Notes

  1. A function \(g:\mathbb{R}\rightarrow\mathbb{R}\) is called coercive if \(g(x)\rightarrow\infty\) as \(|x|\rightarrow\infty\).

Acknowledgements

The authors gratefully acknowledge support from the DFG Research Training Group 2088 Discovering Structure in Complex Data: Statistics Meets Optimization and Inverse Problems and the DFG Cluster of Excellence 2067 Multiscale Bioimaging: From Molecular Machines to Networks of Excitable Cells. The authors would also like to thank the editor and the anonymous referees for their comments that improved the quality of this paper. In particular, credit is given to one referee for the suggestion to consider rotationally invariant distributions for the spherical OT distance.

Author information

Corresponding author

Correspondence to Shayan Hundrieser.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Hundrieser, S., Klatt, M., Munk, A. (2022). The Statistics of Circular Optimal Transport. In: SenGupta, A., Arnold, B.C. (eds) Directional Statistics for Innovative Applications. Forum for Interdisciplinary Mathematics. Springer, Singapore. https://doi.org/10.1007/978-981-19-1044-9_4
