
Foundations of Computational Mathematics, Volume 14, Issue 4, pp 745–789

Robust Statistics, Hypothesis Testing, and Confidence Intervals for Persistent Homology on Metric Measure Spaces

  • Andrew J. Blumberg
  • Itamar Gal
  • Michael A. Mandell
  • Matthew Pancia

Abstract

We study distributions of persistent homology barcodes associated to taking subsamples of a fixed size from metric measure spaces. We show that such distributions provide robust invariants of metric measure spaces and illustrate their use in hypothesis testing and providing confidence intervals for topological data analysis.

Keywords

Persistent homology, Stability, Robustness, Barcode space, Bottleneck metric, Gromov–Prohorov metric, Hypothesis testing, Confidence interval, Metric measure space

Mathematics Subject Classification

55U10, 68U05

Notes

Acknowledgments

The authors would like to thank Gunnar Carlsson and Michael Lesnick for useful comments, Rachel Ward for comments on a prior draft, and Olena Blumberg for help with background and for assistance with the analysis of the tightness of the main theorem. We would also like to thank the Institute for Mathematics and Its Applications for hospitality while revising this paper. The authors were supported in part by Defense Advanced Research Projects Agency (DARPA) Young Faculty Award N66001-10-1-4043.

Copyright information

© SFoCM 2014

Authors and Affiliations

  • Andrew J. Blumberg (1)
  • Itamar Gal (1)
  • Michael A. Mandell (2)
  • Matthew Pancia (1)
  1. Department of Mathematics, University of Texas at Austin, Austin, USA
  2. Department of Mathematics, Indiana University, Bloomington, USA
