Abstract
Persistent homology allows us to create topological summaries of complex data. To analyse these statistically, we need to choose a topological summary and a metric space in which that summary lies. Different summaries may contain the same information (as they come from the same persistence module), yet they can lead to different statistical conclusions because they lie in different metric spaces. The best choice of metric will often be application-specific. In this paper we discuss distance correlation, a non-parametric tool for comparing data sets that can lie in completely different metric spaces. In particular, we calculate the distance correlation between different choices of topological summaries. We compare several topological summaries for a variety of random models of underlying data via the distance correlation between the samples. We also give examples of computing the distance correlation between topological summaries and other scalar measures of interest, such as a paired random variable or a parameter of the random model used to generate the underlying data. This article is expository in style and includes the definitions of standard statistical quantities in order to be accessible to non-statisticians.
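As an illustration of the central quantity, the following is a minimal sketch of the sample distance correlation in the sense of Székely, Rizzo and Bakirov, computed from two matrices of pairwise distances over the same paired sample. It is not the authors' implementation; the function names (`double_center`, `distance_correlation`, `pairwise_abs`) are hypothetical, and the toy data are real numbers with the absolute-value metric. In the setting of the paper, the two matrices could instead hold, for example, distances between persistence diagrams under one metric and distances between another summary of the same diagrams under a different metric.

```python
import math


def double_center(d):
    """Subtract row and column means from a square matrix and add back the grand mean."""
    n = len(d)
    row_means = [sum(row) / n for row in d]
    col_means = [sum(d[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_means) / n
    return [
        [d[i][j] - row_means[i] - col_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def _mean_product(a, b):
    """Average of the entrywise product of two n-by-n matrices."""
    n = len(a)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / (n * n)


def distance_correlation(dx, dy):
    """Sample distance correlation from two pairwise distance matrices.

    dx[i][j] is the distance between observations i and j in the first
    metric space, dy[i][j] the distance between the paired observations
    in the second. Only the distances are used, so the two spaces may differ.
    """
    if len(dx) != len(dy):
        raise ValueError("both distance matrices must cover the same paired sample")
    a = double_center(dx)
    b = double_center(dy)
    dcov_sq = _mean_product(a, b)    # squared sample distance covariance
    dvar_x_sq = _mean_product(a, a)  # squared sample distance variance of X
    dvar_y_sq = _mean_product(b, b)  # squared sample distance variance of Y
    denom = math.sqrt(dvar_x_sq * dvar_y_sq)
    if denom == 0:
        return 0.0  # convention when one sample has no spread
    return math.sqrt(max(dcov_sq, 0.0) / denom)


def pairwise_abs(values):
    """Toy distance matrix: absolute differences between real numbers."""
    return [[abs(u - v) for v in values] for u in values]


xs = [0.0, 1.0, 2.0, 3.0]
ys = [2.0 * x + 1.0 for x in xs]  # exact linear dependence on xs
print(distance_correlation(pairwise_abs(xs), pairwise_abs(ys)))
```

Because the function only ever sees distances, swapping `pairwise_abs` for any routine that returns a symmetric matrix of pairwise distances leaves the rest unchanged. Whether a value of zero characterises independence depends on properties of the metric spaces involved, which this sketch does not check.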
Notes
1. We will consider only persistent homology with intervals with finite death.
2. When analyzing real data, one often cones off the space at some more or less meaningful maximum filtration so as to avoid infinite intervals.
3. The data was provided by the Norwegian Mapping Authority [10] under a CC-BY-4.0 license.
References
Rushil Anirudh, Vinay Venkataraman, Karthikeyan Natesan Ramamurthy, and Pavan Turaga. “A Riemannian Framework for Statistical Analysis of Topological Persistence Diagrams”. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. 2016, pp. 68–76.
Ulrich Bauer. Ripser: efficient computation of Vietoris-Rips persistence barcodes. 2019. arXiv: 1908.02518.
Christophe Biscio and Jesper Møller. “The accumulated persistence function, a new useful functional summary statistic for topological data analysis, with a view to brain artery trees and spatial point process applications”. In: Journal of Computational and Graphical Statistics (2019), pp. 1–20.
Peter Bubenik. “Statistical topological data analysis using persistence landscapes”. In: The Journal of Machine Learning Research 16.1 (2015), pp. 77–102.
Peter Bubenik and Paweł Dłotko. “A persistence landscapes toolbox for topological statistics”. In: Journal of Symbolic Computation 78 (2017), pp. 91–114.
Mathieu Carrière, Marco Cuturi, and Steve Oudot. “Sliced Wasserstein kernel for persistence diagrams”. In: Proceedings of the 34th International Conference on Machine Learning. Vol. 70. JMLR.org. 2017, pp. 664–673.
Mathieu Carrière, Steve Y Oudot, and Maks Ovsjanikov. “Stable topological signatures for points on 3d shapes”. In: Computer Graphics Forum. Vol. 34. 5. Wiley Online Library. 2015, pp. 1–12.
Wikimedia Commons user DenisBoigelot. Examples of correlations. In the public domain. 2011. https://commons.wikimedia.org/wiki/File:Correlation_examples2.svg
Barbara Di Fabio and Massimo Ferri. “Comparing persistence diagrams through complex vectors”. In: International Conference on Image Analysis and Processing. Springer. 2015, pp. 294–305.
Norwegian Mapping Authority / Statens Kartverk. DTM 10 elevation data. Copyright Statens Kartverk, CC-BY-4.0. https://www.kartverket.no
Michael Kerber, Dmitriy Morozov, and Arnur Nigmetov. “Geometry helps to compare persistence diagrams”. In: Journal of Experimental Algorithmics (JEA) 22 (2017), pp. 1–4.
Genki Kusano, Kenji Fukumizu, and Yasuaki Hiraoka. “Kernel method for persistence diagrams via kernel embedding and weight factor”. In: Journal of Machine Learning Research 18.189 (2018), pp. 1–41.
Tam Le and Makoto Yamada. “Persistence Fisher kernel: A Riemannian manifold kernel for persistence diagrams”. In: Advances in Neural Information Processing Systems. 2018, pp. 10007–10018.
Russell Lyons et al. “Distance covariance in metric spaces”. In: The Annals of Probability 41.5 (2013), pp. 3284–3305.
Daniel Lütgehetmann. Flagser. URL: https://github.com/luetge/flagser
Mark W Meckes. “Positive definite metric spaces”. In: Positivity 17.3 (2013), pp. 733–757.
Wikimedia Commons user Naught101. Examples of correlations. Licensed under CC-BY-SA-3.0. 2012. https://commons.wikimedia.org/wiki/File:Distance_Correlation_Examples.svg
Michael W Reimann et al. “Cliques of neurons bound into cavities provide a missing link between structure and function”. In: Frontiers in Computational Neuroscience 11 (2017), p. 48.
Jan Reininghaus, Stefan Huber, Ulrich Bauer, and Roland Kwitt. “A stable multi-scale kernel for topological machine learning”. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015, pp. 4741–4748.
Shawn J Riley, SD DeGloria, and Robert Elliot. “A Terrain Ruggedness Index that Quantifies Topographic Heterogeneity”. In: Intermountain Journal of Sciences 5.1–4 (1999), pp. 23–27.
Vanessa Robins and Katharine Turner. “Principal component analysis of persistent homology rank functions with case studies of spatial point patterns, sphere packing and colloids”. In: Physica D: Nonlinear Phenomena 334 (2016), pp. 99–117.
Andrew Robinson and Katharine Turner. “Hypothesis testing for topological data analysis”. In: Journal of Applied and Computational Topology 1.2 (2017), pp. 241–261.
Gábor J Székely, Maria L Rizzo, and Nail K Bakirov. “Measuring and testing dependence by correlation of distances”. In: The Annals of Statistics 35.6 (2007), pp. 2769–2794.
The GUDHI Editorial Board. GUDHI. URL: http://gudhi.gforge.inria.fr/
Katharine Turner. Means and medians of sets of persistence diagrams. 2013. arXiv: 1307.8300.
Katharine Turner, Yuriy Mileyko, Sayan Mukherjee, and John Harer. “Fréchet means for distributions of persistence diagrams”. In: Discrete & Computational Geometry 52.1 (2014), pp. 44–70.
C Van Den Berg, Jens Peter Reus Christensen, and Paul Ressel. Harmonic Analysis on Semigroups: Theory of Positive Definite and Related Functions. Springer Science & Business Media, 2012.
Acknowledgements
G.S. would like to thank Andreas Prebensen Korsnes of the Norwegian Mapping Authority for going out of his way to facilitate bulk downloads of DEM data before a single region was decided upon for the experiment in Sect. 5.2.
G.S. was supported by Swiss National Science Foundation grant number 200021_172636.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Turner, K., Spreemann, G. (2020). Same But Different: Distance Correlations Between Topological Summaries. In: Baas, N., Carlsson, G., Quick, G., Szymik, M., Thaule, M. (eds) Topological Data Analysis. Abel Symposia, vol 15. Springer, Cham. https://doi.org/10.1007/978-3-030-43408-3_18
DOI: https://doi.org/10.1007/978-3-030-43408-3_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-43407-6
Online ISBN: 978-3-030-43408-3
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)