Extracting Persistent Clusters in Dynamic Data via Möbius Inversion

  • Published in: Discrete & Computational Geometry

Abstract

Identifying and representing clusters in time-varying network data is of particular importance when studying collective behaviors emerging in nature, in mobile device networks or in social networks. Based on combinatorial, categorical, and persistence theoretic viewpoints, we establish a stable functorial pipeline for the summarization of the evolution of clusters in a time-varying network. We first construct a complete summary of the evolution of clusters in a given time-varying network over a set of entities X, which takes the form of a formigram. This formigram can be understood as a certain Reeb graph \(\mathcal {R}\) which is labeled by subsets of X. By applying Möbius inversion to the formigram in two different manners, we obtain two dual notions of diagram: the maximal group diagram and the persistence clustergram, both of which are in the form of an ‘annotated’ barcode. The maximal group diagram consists of time intervals annotated by their corresponding maximal groups (a notion due to Buchin et al.), implying that we recognize the notion of maximal groups as a special instance of the generalized persistence diagram of Patel. On the other hand, the persistence clustergram is mostly obtained by annotating the intervals in the zigzag barcode of the Reeb graph \(\mathcal {R}\) with certain merging/disbanding events in the given time-varying network. We show that both diagrams are complete invariants of formigrams (or equivalently of the trajectory grouping structure of Buchin et al.) and thus contain more information than the Reeb graph \(\mathcal {R}\).

Code Availability

In [52], a collection of dynamic graphs that naturally arise from an artificial life program, called Boids [71], were successfully classified by the bottleneck distance on their zigzag barcodes (cf. Fig. 1 (F)). An implementation is available in [27]. A demo is available in [53].

Data Availability

Data sharing not applicable to this article.

Notes

  1. The name formigram is a combination of the words formicarium and diagram. A formicarium or ant farm is an enclosure for keeping ants under semi-natural conditions [86]. Visually, a formigram is reminiscent of a formicarium (cf. Fig. 1 (B)).

  2. The function c associated to F is not unique and thus we refer to \(\textrm{im}(c)\) as ‘a’ set of critical points and not as ‘the’ set of critical points.

  3. F([t, t]) is said to be the costalk of F at t in the literature (e.g. [13, 29]).

  4. In [78], this type of DGs was utilized for topological characterization of insect swarms. A sensor network [31, 33] is another example of such DGs arising from viewing each sensor as a point in the dynamic metric space of sensors.

  5. For instance, reflexivity follows since the identity tripod \(X \xleftarrow{\ \textrm{id}_X\ } X \xrightarrow{\ \textrm{id}_X\ } X\) is a 0-tripod between \(\mathcal {G}_X\) and \(\mathcal {G}_X\).

  6. Silhouettes in this paper have no relation with the persistence silhouettes in [24].

  7. In this section groups do not stand for groups in abstract algebra. See Remark 6.18.

  8. In [52, Sect. 5], DMSs generated through the Boids model [71] were successfully classified by the bottleneck distance between their zigzag barcodes (cf. Fig. 1 (F)).

  9. These are formigrams by Proposition 7.5 and Definition 3.10.

  10. In the preprint of this work, we also considered formigrams derived from directed dynamic graphs [56].

References

  1. Adams, H., Ghosh, D., Mask, C., Ott, W., Williams, K.: Efficient evader detection in mobile sensor networks. arXiv preprint arXiv:2101.09813 (2021)

  2. Adams, H., Carlsson, G.: Evasion paths in mobile sensor networks. Int. J. Robot. Res. 34(1), 90–104 (2015)

  3. Azumaya, G., et al.: Corrections and supplementaries to my paper concerning Krull-Remak-Schmidt’s theorem. Nagoya Math. J. 1, 117–124 (1950)

  4. Bauer, U., Ge, X., Wang, Y.: Measuring distance between Reeb graphs. In: Proceedings of the Thirtieth Annual Symposium on Computational Geometry, pp. 464–473 (2014)

  5. Bauer, U., Munch, E., Wang, Y.: Strong Equivalence of the Interleaving and Functional Distortion Metrics for Reeb Graphs. In: Arge, L., Pach, J. (eds.) 31st International Symposium on Computational Geometry (SoCG 2015). Leibniz International Proceedings in Informatics (LIPIcs), vol. 34, pp. 461–475. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany (2015). https://doi.org/10.4230/LIPIcs.SOCG.2015.461. http://drops.dagstuhl.de/opus/volltexte/2015/5146

  6. Bauer, U., Lesnick, M.: Induced matchings and the algebraic stability of persistence barcodes. J. Comput. Geom. 6(2), 162–191 (2015)

  7. Benkert, M., Gudmundsson, J., Hübner, F., Wolle, T.: Reporting flock patterns. Comput. Geom. 41(3), 111–125 (2008)

  8. Birkhoff, G.: Lattice Theory, vol. 25. American Mathematical Society, Providence (1948)

  9. Bjerkevik, H.B.: On the stability of interval decomposable persistence modules. Discrete Comput. Geom. 66(1), 92–121 (2021)

  10. Bondy, J., Murty, U.: Graph Theory (Graduate Texts in Mathematics). Springer, New York (2008)

  11. Botnan, M.B.: Interval decomposition of infinite zigzag persistence modules. Proc. Am. Math. Soc. 145(8), 3571–3577 (2017)

  12. Botnan, M., Lesnick, M.: Algebraic stability of zigzag persistence modules. Algebraic Geom. Topol. 18(6), 3133–3204 (2018)

  13. Bredon, G.E.: Sheaf Theory, vol. 170. Springer, New York (2012)

  14. Bubenik, P., Scott, J.A.: Categorification of persistent homology. Discrete Comput. Geom. 51(3), 600–627 (2014)

  15. Buchin, K., Buchin, M., van Kreveld, M.J., Speckmann, B., Staals, F.: Trajectory grouping structure. JoCG 6(1), 75–98 (2015)

  16. Burago, D., Burago, Y., Ivanov, S.: A Course in Metric Geometry. AMS Graduate Studies in Math., vol. 33. American Mathematical Society, Providence (2001)

  17. Carlsson, G., Mémoli, F.: Multiparameter hierarchical clustering methods. In: Classification as a Tool for Research, pp. 63–70. Springer (2010)

  18. Carlsson, G.: Topology and data. Bull. Am. Math. Soc. 46(2), 255–308 (2009)

  19. Carlsson, G., Mémoli, F.: Characterization, stability and convergence of hierarchical clustering methods. J. Mach. Learn. Res. 11, 1425–1470 (2010)

  20. Carlsson, G., Mémoli, F.: Classifying clustering schemes. Found. Comput. Math. 13(2), 221–252 (2013)

  21. Carlsson, G., De Silva, V.: Zigzag persistence. Found. Comput. Math. 10(4), 367–405 (2010)

  22. Carlsson, G., Zomorodian, A.: The theory of multidimensional persistence. Discrete Comput. Geom. 42(1), 71–93 (2009)

  23. Chazal, F., Cohen-Steiner, D., Glisse, M., Guibas, L.J., Oudot, S.: Proximity of persistence modules and their diagrams. In: Proceedings of 25th ACM Symposium on Computational Geometry, pp. 237–246 (2009)

  24. Chazal, F., Fasy, B.T., Lecci, F., Rinaldo, A., Wasserman, L.: Stochastic convergence of persistence landscapes and silhouettes. In: Proceedings of the Thirtieth Annual Symposium on Computational Geometry, pp. 474–483 (2014)

  25. Chowdhury, S., Mémoli, F.: Explicit geodesics in Gromov-Hausdorff space. Electron. Res. Announc. 25, 48–59 (2018)

  26. Clause, N., Kim, W.: Spatiotemporal Persistent Homology Computation Tool. https://github.com/ndag/PHoDMSs (2020)

  27. Clause, N.: Zigzag Persistent Homology and Dynamic Networks. https://github.com/ndag/DynGraphZZ (2021)

  28. Cohen-Steiner, D., Edelsbrunner, H., Harer, J.: Stability of persistence diagrams. Discrete Comput. Geom. 37(1), 103–120 (2007)

  29. Curry, J.M.: Sheaves, cosheaves and applications. PhD thesis, University of Pennsylvania (2014)

  30. Curry, J., Patel, A.: Classification of constructible cosheaves. Theory Appl. Categories 35(27), 1012–1047 (2020)

  31. De Silva, V., Ghrist, R.: Coordinate-free coverage in sensor networks with controlled boundaries via homology. Int. J. Robot. Res. 25(12), 1205–1222 (2006)

  32. De Silva, V., Munch, E., Patel, A.: Categorified Reeb graphs. Discrete Comput. Geom. 55(4), 854–906 (2016)

  33. De Silva, V., Ghrist, R., et al.: Homological sensor networks. Notices of the American Mathematical Society 54(1) (2007)

  34. Dey, T.K., Hou, T.: Computing zigzag persistence on graphs in near-linear time. In: 37th International Symposium on Computational Geometry (SoCG 2021). Leibniz International Proceedings in Informatics (LIPIcs), vol. 189, pp. 30:1–30:15. Schloss Dagstuhl–Leibniz-Zentrum für Informatik (2021). https://doi.org/10.4230/LIPIcs.SoCG.2021.30

  35. Dey, T.K., Hou, T.: Updating zigzag persistence and maintaining representatives over changing filtrations. arXiv preprint arXiv:2112.02352 (2021)

  36. Di Fabio, B., Landi, C.: The edit distance for Reeb graphs of surfaces. Discrete Comput. Geom. 55(2), 423–461 (2016)

  37. Edelsbrunner, H., Harer, J.: Persistent homology—a survey. Contemp. Math. 453, 257–282 (2008)

  38. Edelsbrunner, H., Letscher, D., Zomorodian, A.: Topological persistence and simplification. Discrete Comput. Geom. 28, 511–533 (2002)

  39. Gabriel, P.: Unzerlegbare Darstellungen I. Manuscr. Math. 6(1), 71–103 (1972)

  40. Gamble, J., Chintakunta, H., Krim, H.: Applied topology in static and dynamic sensor networks. In: 2012 International Conference on Signal Processing and Communications (SPCOM), pp. 1–5 (2012). IEEE

  41. Ghrist, R., Riess, H.: Cellular sheaves of lattices and the Tarski Laplacian. Homol. Homotopy Appl. 24(1), 325–345 (2022)

  42. Gonzalez-Diaz, R., Jimenez, M.-J., Medrano, B.: Spatiotemporal barcodes for image sequence analysis. In: International Workshop on Combinatorial Image Analysis, pp. 61–70 (2015). Springer

  43. Gudmundsson, J., van Kreveld, M.: Computing longest duration flocks in trajectory data. In: Proceedings of the 14th Annual ACM International Symposium on Advances in Geographic Information Systems, pp. 35–42 (2006). ACM

  44. Gudmundsson, J., van Kreveld, M., Speckmann, B.: Efficient detection of patterns in 2d trajectories of moving points. Geoinformatica 11(2), 195–215 (2007)

  45. Hajij, M., Wang, B., Scheidegger, C., Rosen, P.: Visual detection of structural changes in time-varying graphs using persistent homology, 125–134 (2018). IEEE

  46. Huang, Y., Chen, C., Dong, P.: Modeling herds and their evolvements from trajectory data. In: International Conference on Geographic Information Science, pp. 90–105 (2008). Springer

  47. Hwang, S.-Y., Liu, Y.-H., Chiu, J.-K., Lim, E.-P.: Mining mobile group patterns: A trajectory-based approach. In: PAKDD, vol. 3518, pp. 713–718 (2005). Springer

  48. Jardine, N., Sibson, R.: Mathematical Taxonomy, p. 286. Wiley, London (1971). (Wiley Series in Probability and Mathematical Statistics)

  49. Jeung, H., Yiu, M.L., Zhou, X., Jensen, C.S., Shen, H.T.: Discovery of convoys in trajectory databases. Proc. VLDB Endow. 1(1), 1068–1080 (2008)

  50. Kalnis, P., Mamoulis, N., Bakiras, S.: On discovering moving clusters in spatio-temporal data. In: SSTD, vol. 3633, pp. 364–381 (2005). Springer

  51. Kerber, M., Morozov, D., Nigmetov, A.: Geometry Helps to Compare Persistence Diagrams. ACM, New York (2017)

  52. Kim, W., Mémoli, F., Smith, Z.: Analysis of dynamic graphs and dynamic metric spaces via zigzag persistence. In: Topological Data Analysis, pp. 371–389. Springer, (2020)

  53. Kim, W., Mémoli, F., Smith, Z.: Clustering behavior summary of dynamic metric data (2017). https://research.math.osu.edu/networks/formigrams

  54. Kim, W., Mémoli, F., Stefanou, A.: Interleaving by parts for persistence in a poset. arXiv preprint arXiv:1912.04366 (2019)

  55. Kim, W., Mémoli, F.: Formigrams: Clustering summaries of dynamic data. In: Proceedings of 30th Canadian Conference on Computational Geometry (CCCG18) (2018)

  56. Kim, W., Mémoli, F.: Stable signatures for dynamic graphs and dynamic metric spaces via zigzag persistence. arXiv preprint arXiv:1712.04064v4 (2017)

  57. Kim, W., Mémoli, F.: Generalized persistence diagrams for persistence modules over posets. J. Appl. Comput. Topol. 5(4), 533–581 (2021)

  58. Kim, W., Mémoli, F.: Spatiotemporal persistent homology for dynamic metric spaces. Discrete Comput. Geom. 66(3), 831–875 (2021)

  59. Li, Z., Ding, B., Han, J., Kays, R.: Swarm: mining relaxed temporal moving object clusters. Proc. VLDB Endow. 3(1–2), 723–734 (2010)

  60. Mac Lane, S.: Categories for the Working Mathematician, vol. 5. Springer, New York (2013)

  61. McCleary, A., Patel, A.: Bottleneck stability for generalized persistence diagrams. Proc. Am. Math. Soc. 148(7), 3149–3161 (2020)

  62. McCleary, A., Patel, A.: Edit distance and persistence diagrams over lattices. SIAM J. Appl. Algebra Geom. 6(2), 134–155 (2022)

  63. Mémoli, F.: A distance between filtered spaces via tripods. arXiv preprint arXiv:1704.03965 (2017)

  64. Mitchell, B.: Theory of Categories, vol. 17. Academic Press, Washington, DC (1965)

  65. Morozov, D., Beketayev, K., Weber, G.: Interleaving distance between merge trees. Discrete Comput. Geom. 49, 22–45 (2013)

  66. Munch, E.: Applications of persistent homology to time varying systems. PhD thesis, Duke University (2013)

  67. Parrish, J.K., Hamner, W.M.: Animal Groups in Three Dimensions: How Species Aggregate. Cambridge University Press, Cambridge (1997)

  68. Patel, A.: Reeb spaces and the robustness of preimages. PhD thesis, Duke University (2010)

  69. Patel, A.: Generalized persistence diagrams. J. Appl. Comput. Topol. 1(3), 397–419 (2018)

  70. Puuska, V.: Erosion distance for generalized persistence modules. Homol. Homotopy Appl. 22(1), 233–254 (2020)

  71. Reynolds, C.W.: Flocks, herds and schools: a distributed behavioral model. ACM SIGGRAPH Comput. Graph. 21(4), 25–34 (1987)

  72. Rolle, A., Scoccola, L.: Stable and consistent density-based clustering. arXiv preprint arXiv:2005.09048 (2020)

  73. Rossetti, G., Cazabet, R.: Community discovery in dynamic networks: a survey. ACM Comput. Surv. 51(2), 1–37 (2018)

  74. Rota, G.-C.: On the foundations of combinatorial theory I. Theory of Möbius functions. Probab. Theory Relat. Fields 2(4), 340–368 (1964)

  75. Rubenstein, M., Ahler, C., Nagpal, R.: Kilobot: A low cost scalable robot system for collective behaviors. In: 2012 IEEE International Conference on Robotics and Automation, pp. 3293–3298 (2012). IEEE

  76. Schmiedl, F.: Shape matching and mesh segmentation. PhD thesis, Technische Universität München (2015)

  77. Schmiedl, F.: Computational aspects of the Gromov-Hausdorff distance and its application in non-rigid shape matching. Discrete Comput. Geom. 57(4), 854–880 (2017)

  78. Sinhuber, M., Ouellette, N.T.: Phase coexistence in insect swarms. Phys. Rev. Lett. 119(17), 178003 (2017)

  79. Sumpter, D.J.: Collective Animal Behavior. Princeton University Press, Princeton (2010)

  80. Topaz, C.M., Ziegelmeier, L., Halverson, T.: Topological data analysis of biological aggregation models. PLoS ONE 10(5), e0126383 (2015)

  81. van Goethem, A., van Kreveld, M., Löffler, M., Speckmann, B., Staals, F.: Grouping Time-Varying Data for Interactive Exploration. In: 32nd International Symposium on Computational Geometry (SoCG 2016). Leibniz International Proceedings in Informatics (LIPIcs), vol. 51, pp. 61:1–61:16. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik (2016). https://doi.org/10.4230/LIPIcs.SoCG.2016.61. http://drops.dagstuhl.de/opus/volltexte/2016/5953

  82. van Kreveld, M., Löffler, M., Staals, F., Wiratma, L.: A refined definition for groups of moving entities and its computation. Int. J. Comput. Geom. Appl. 28(02), 181–196 (2018)

  83. Vehlow, C., Beck, F., Auwärter, P., Weiskopf, D.: Visualizing the evolution of communities in dynamic graphs. Comput. Graph. Forum 34(1), 277–288 (2015). (Wiley Online Library)

  84. Vieira, M.R., Bakalov, P., Tsotras, V.J.: On-line discovery of flock patterns in spatio-temporal data. In: Proceedings of the 17th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, pp. 286–295 (2009). ACM

  85. Wang, Y., Lim, E.-P., Hwang, S.-Y.: Efficient algorithms for mining maximal valid groups. VLDB J. 17(3), 515–535 (2008)

  86. Wikipedia: Formicarium—Wikipedia, The Free Encyclopedia. https://en.wikipedia.org/wiki/Formicarium. Accessed 12 Dec 2021

  87. Wiratma, L., van Kreveld, M., Löffler, M., Staals, F.: An experimental evaluation of grouping definitions for moving entities. In: Proceedings of the 27th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, pp. 89–98 (2019)

  88. Xian, L., Adams, H., Topaz, C.M., Ziegelmeier, L.: Capturing dynamics of time-varying data via topology. Found. Data Sci. 4(1), 1–36 (2022)

Acknowledgements

This work was partially supported by NSF grants IIS-1422400, CCF-1526513, DMS-1723003, and CCF-1740761. We thank Michael Lesnick for useful comments about the paper. We also thank Zane Smith for providing an example of non-planar formigram in Example 3.14. WK thanks Amit Patel for beneficial discussions regarding topics related to Sect. 6.

Author information

Correspondence to Woojin Kim.

Ethics declarations

Conflict of interest/Competing interests

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Editor in Charge: Kenneth Clarkson

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A Details and Proofs

A.1 Bottleneck distance

Recall that injective partial functions are referred to as matchings. We use \(\sigma :A\nrightarrow B\) to denote a matching \(\sigma \subset A\times B\) between sets A and B. The canonical projections of \(\sigma \) onto A and B are denoted by \(\textrm{coim} (\sigma )\) and \(\textrm{im} (\sigma )\), respectively.

Many equivalent expressions for the bottleneck distance have been given in the TDA literature. We adopt the following form from [6]: Recall Notation 2.12. Letting \({\mathcal {A}}\) be a multiset of intervals in \(\textbf{R}\) and \(\varepsilon \ge 0\),

$$\begin{aligned} {\mathcal {A}}^\varepsilon :=\left\{ \langle b,d\rangle \in {\mathcal {A}}:b+\varepsilon <d \right\} =\left\{ I\in {\mathcal {A}}:[t,t+\varepsilon ]\subset I\ \text{ for } \text{ some }\ t\in \textbf{R}\right\} . \end{aligned}$$

Note that \({\mathcal {A}}^0={\mathcal {A}}\).

Definition A.1

([6]) Let \({\mathcal {A}}\) and \({\mathcal {B}}\) be multisets of intervals in \(\textbf{R}\). We define a \(\delta \)-matching between \({\mathcal {A}}\) and \({\mathcal {B}}\) to be a matching \(\sigma :{\mathcal {A}}\nrightarrow {\mathcal {B}}\) such that \({\mathcal {A}}^{2\delta }\subset \textrm{coim} (\sigma )\), \({\mathcal {B}}^{2\delta }\subset \textrm{im} (\sigma )\), and if \(\sigma \langle b,d\rangle =\langle b',d'\rangle \), then

$$\begin{aligned}\langle b,d\rangle \subset \langle b'-\delta , d'+\delta \rangle ,\hspace{5mm} \langle b',d'\rangle \subset \langle b-\delta , d+\delta \rangle . \end{aligned}$$

with the convention \(+\infty +\delta =+\infty \) and \(-\infty -\delta =-\infty \). We define the bottleneck distance \(d_\textrm{B}\) by

$$\begin{aligned} d_\textrm{B}({\mathcal {A}},{\mathcal {B}}):=\inf \{\delta \in [0,\infty ): \exists \delta -\text{ matching } \text{ between } {\mathcal {A}} \text{ and } {\mathcal {B}}\}. \end{aligned}$$

We declare \(d_\textrm{B}({\mathcal {A}},{\mathcal {B}})=+\infty \) when there is no \(\delta \)-matching between \({\mathcal {A}}\) and \({\mathcal {B}}\) for any \(\delta \in [0,\infty )\).
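The conditions in Definition A.1 can be checked mechanically for a candidate matching. The following Python sketch assumes closed intervals stored as pairs (b, d), multisets stored as lists, and a matching stored as a dict from indices of \({\mathcal {A}}\) to indices of \({\mathcal {B}}\). The function name `is_delta_matching` and this representation are illustrative assumptions; they are not taken from the paper or from the implementation in [27].

```python
def is_delta_matching(A, B, sigma, delta):
    """Check the three conditions of a delta-matching (closed intervals assumed).

    A, B  : lists of (b, d) pairs, possibly with repeated entries (multisets)
    sigma : dict mapping an index of A to an index of B (a partial function)
    """
    # A matching must be injective.
    if len(set(sigma.values())) != len(sigma):
        return False
    # Every interval of A longer than 2*delta must be matched.
    for i, (b, d) in enumerate(A):
        if b + 2 * delta < d and i not in sigma:
            return False
    # Every interval of B longer than 2*delta must be matched.
    matched_in_B = set(sigma.values())
    for j, (b, d) in enumerate(B):
        if b + 2 * delta < d and j not in matched_in_B:
            return False
    # Matched intervals must contain each other after thickening by delta.
    for i, j in sigma.items():
        (b, d), (b2, d2) = A[i], B[j]
        if not (b2 - delta <= b and d <= d2 + delta):
            return False
        if not (b - delta <= b2 and d2 <= d + delta):
            return False
    return True
```

Infinite endpoints can be passed as `float("inf")`, for which Python's float arithmetic already follows the convention \(+\infty +\delta =+\infty \). The bottleneck distance itself is the infimum of the admissible \(\delta \) over all matchings, which this checker does not search for.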

A.2 Proof of Theorem 4.7

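We recall the Gromov-Hausdorff distance between metric spaces. Let \((X,d_X)\) and \((Y,d_Y)\) be any two metric spaces and let \(R:\ X \xleftarrow{\ \varphi _X\ } Z \xrightarrow{\ \varphi _Y\ } Y\) be a tripod between X and Y, that is, a pair of surjections \(\varphi _X\) and \(\varphi _Y\) from a common set Z. Then, the distortion of R is defined as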

$$\begin{aligned} \displaystyle \textrm{dis}(R):=\sup _{\begin{array}{c} z,z'\in Z \end{array}}\left|d_X\left( \varphi _X(z),\varphi _X(z')\right) -d_Y\left( \varphi _Y(z),\varphi _Y(z')\right) \right|. \end{aligned}$$

Definition A.2

(Gromov-Hausdorff distance [16, Sect. 7.3]) Let \((X,d_X)\) and \((Y,d_Y)\) be any two compact metric spaces. Then,

$$\begin{aligned} d_\textrm{GH}\left( (X,d_X),(Y,d_Y)\right) =\frac{1}{2}\inf _R\ \textrm{dis}(R) \end{aligned}$$

where the infimum is taken over all tripods R between X and Y. In particular, any tripod R between X and Y is said to be an \(\varepsilon \)-tripod between \((X,d_X)\) and \((Y,d_Y)\) if \(\textrm{dis}(R)\le \varepsilon \).
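For very small finite metric spaces, Definition A.2 can be evaluated by brute force. The sketch below relies on the standard observation that the distortion of a tripod depends only on the set of pairs \((\varphi _X(z),\varphi _Y(z))\), which is a relation on \(X\times Y\) whose two projections are surjective. It assumes the metrics are given as square matrices (lists of lists); the function names are illustrative, and the running time is exponential in \(|X|\cdot |Y|\), so this is only a sanity-check tool.

```python
from itertools import combinations, product

def distortion(dX, dY, R):
    """Distortion of a relation R, given as a collection of index pairs (x, y)."""
    return max(abs(dX[x][x2] - dY[y][y2]) for (x, y) in R for (x2, y2) in R)

def gromov_hausdorff(dX, dY):
    """Brute-force d_GH between two tiny finite metric spaces."""
    nX, nY = len(dX), len(dY)
    pairs = list(product(range(nX), range(nY)))
    all_X, all_Y = set(range(nX)), set(range(nY))
    best = float("inf")
    # Enumerate every relation on X x Y that projects onto both X and Y.
    for size in range(max(nX, nY), len(pairs) + 1):
        for R in combinations(pairs, size):
            if {x for x, _ in R} == all_X and {y for _, y in R} == all_Y:
                best = min(best, distortion(dX, dY, R))
    return best / 2
```

For two-point spaces with distances 1 and 3, the only relation avoiding a distortion of 3 is a bijection, which has distortion 2, so the function returns 1.0.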

Proposition A.3

Let \((X,d_X)\) and \((Y,d_Y)\) be any two finite metric spaces. Then, there exist two DGs \(\mathcal {G}_X=\left( V_X(\cdot ),E_X(\cdot )\right) \) and \(\mathcal {G}_Y=\left( V_Y(\cdot ),E_Y(\cdot )\right) \) corresponding to \((X,d_X)\) and \((Y,d_Y)\) respectively such that

$$\begin{aligned} d_\textrm{I}^{\textrm{dynG}}(\mathcal {G}_X,\mathcal {G}_Y)=2\cdot d_\textrm{GH}\left( (X,d_X),(Y,d_Y)\right) . \end{aligned}$$

Proof

Let \(D\ge 0\) be the diameter of \((X,d_X)\). For \(t\in \textbf{R}\), we define:

$$\begin{aligned} V_X(t)&:={\left\{ \begin{array}{ll} \emptyset ,&{}t\notin [0,D]\\ X,&{}t\in [0,D], \end{array}\right. }\\ E_X(t)&:={\left\{ \begin{array}{ll} \emptyset ,&{}t\notin [0,D]\\ \{\{x,x'\}\subset X: x\ne x' \text{ and } d_X(x,x')\le t\},&{}t\in [0,D]. \end{array}\right. } \end{aligned}$$

We define \(\mathcal {G}_X\) by \(t\mapsto (V_X(t),E_X(t))\). Define \(\mathcal {G}_Y\) similarly. We show that \(d_\textrm{I}^{\textrm{dynG}}(\mathcal {G}_X,\mathcal {G}_Y)\ge 2\cdot d_\textrm{GH}\left( (X,d_X),(Y,d_Y)\right) .\) To this end, suppose that for some \(\varepsilon \ge 0\), \(R:\ X \xleftarrow{\ \varphi _X\ } Z \xrightarrow{\ \varphi _Y\ } Y\) is any \(\varepsilon \)-tripod between \(\mathcal {G}_X\) and \(\mathcal {G}_Y\) (cf. Definition 4.4). Then, by the construction of \(\mathcal {G}_X,\mathcal {G}_Y\), it must hold that \(\left|d_X\left( \varphi _X(z),\varphi _X(z')\right) -d_Y\left( \varphi _Y(z),\varphi _Y(z')\right) \right|\le \varepsilon \) for all \(z,z'\in Z\); that is, \(\textrm{dis}(R)\le \varepsilon \), and hence \(2\cdot d_\textrm{GH}\left( (X,d_X),(Y,d_Y)\right) \le \varepsilon \) by Definition A.2. The other inequality \(d_\textrm{I}^{\textrm{dynG}}(\mathcal {G}_X,\mathcal {G}_Y)\le 2\cdot d_\textrm{GH}\left( (X,d_X),(Y,d_Y)\right) \) can be similarly proved. \(\square \)
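The dynamic graph \(\mathcal {G}_X\) used in this proof is easy to write down explicitly. The following Python sketch builds the map \(t\mapsto (V_X(t),E_X(t))\) from a distance matrix; the representation (points as indices, edges as two-element frozensets) and the name `dynamic_graph` are assumptions made for illustration only.

```python
def dynamic_graph(dX):
    """Return the function t -> (V_X(t), E_X(t)) for a finite metric space.

    dX is a symmetric distance matrix given as a list of lists; the points of X
    are the indices 0, ..., n-1.
    """
    n = len(dX)
    D = max(max(row) for row in dX)  # diameter of (X, d_X)

    def G(t):
        # Outside [0, D] both the vertex set and the edge set are empty.
        if t < 0 or t > D:
            return frozenset(), frozenset()
        V = frozenset(range(n))
        # Inside [0, D], two distinct points are adjacent when d_X(x, x') <= t.
        E = frozenset(
            frozenset((i, j))
            for i in range(n)
            for j in range(i + 1, n)
            if dX[i][j] <= t
        )
        return V, E

    return G
```

For instance, with three points whose pairwise distances are 1, 2, 2, the graph at time 1 has a single edge, the graph at time 2 is complete, and the graph is empty for every time beyond the diameter 2.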

Definition A.4

An ultrametric space is a metric space (X, d) in which the following ultra-triangle inequality holds: for all \(x,y,z\in X\),

$$\begin{aligned}d(x,z)\le \max \left\{ d(x,y),d(y,z)\right\} .\end{aligned}$$

If d is only a pseudometric on X that satisfies this inequality, then d is called an ultra-pseudometric.
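As a small illustration of Definition A.4, the ultra-triangle inequality can be verified directly on a finite distance matrix. The helper name `is_ultrametric` and the matrix representation are assumptions of this sketch; it only tests the ultra-triangle inequality and does not check the remaining (pseudo)metric axioms.

```python
def is_ultrametric(d):
    """Check d(x, z) <= max(d(x, y), d(y, z)) for all triples of indices."""
    n = len(d)
    return all(
        d[x][z] <= max(d[x][y], d[y][z])
        for x in range(n)
        for y in range(n)
        for z in range(n)
    )
```

Three equally spaced points on a line give a metric that fails the test, since the two outer points are at distance 2 while each is at distance 1 from the middle point.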

Proof of Theorem 4.7

Pick any two ultrametric spaces \((X,u_X)\) and \((Y,u_Y)\). Then, by Proposition A.3, there exist DGs \(\mathcal {G}_X=\left( V_X(\cdot ),E_X(\cdot )\right) \) and \(\mathcal {G}_Y=\left( V_Y(\cdot ),E_Y(\cdot )\right) \) such that the interleaving distance between \(\mathcal {G}_X\) and \(\mathcal {G}_Y\) is identical to twice the Gromov-Hausdorff distance \(\Delta :=d_\textrm{GH}((X,u_X),(Y,u_Y))\) between \((X,u_X)\) and \((Y,u_Y)\). However, according to [77, Cor. 3.8], \(\Delta \) cannot be approximated within any factor less than 3 in polynomial time, unless \(P=NP\). The author shows this by observing that any instance of the 3-partition problem can be reduced to an instance of the bottleneck \(\infty \)-Gromov-Hausdorff distance (\(\infty \)-BGHD) problem between ultrametric spaces (cf. [77, p. 865]). The proof follows. \(\square \)

A.3 Details about Remark 4.18

Remark A.5

(Interleaving between dendrograms) When \(\theta _X,\theta _Y\) are dendrograms over sets X and Y respectively, let \(R:\ X \xleftarrow{\ \varphi _X\ } Z \xrightarrow{\ \varphi _Y\ } Y\) be an \(\varepsilon \)-tripod between \(\theta _X\) and \(\theta _Y\). Since both \(\theta _X\) and \(\theta _Y\) get coarser as \(t\in \textbf{R}\) increases, the interleaving condition in Definition 4.11 can be rewritten as follows: for all \(t\in \textbf{R}\) it holds that \(\theta _X(t) \le _{R} \theta _Y(t+\varepsilon )\) and \(\theta _Y(t) \le _{R} \theta _X(t+\varepsilon )\) (cf. Definition 4.10).

Let X be a finite set and let \(\theta _X\) be a dendrogram over X (cf. Remark 3.7). Recall from [19] that this \(\theta _X\) induces a canonical ultra-pseudometric \(u_X:X\times X\rightarrow \textbf{R}_+\) on X (cf. Definition A.4) defined by

$$\begin{aligned} u_X(x,x'):=\inf \{\varepsilon \ge 0: x,x' \text{ belong } \text{ to } \text{ the } \text{ same } \text{ block } \text{ of } \theta _X(\varepsilon )\} \end{aligned} \tag{A1}$$
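Formula (A1) can be computed directly when the dendrogram is piecewise constant. The sketch below assumes the dendrogram is given as a list of pairs (t, partition) sorted by increasing t, where each listed partition is the value of \(\theta _X\) from that time up to the next listed time, and where the last partition consists of a single block. Under that assumption the infimum in (A1) is attained at a listed time. The function name and the input format are illustrative choices.

```python
def ultrametric_from_dendrogram(levels):
    """Compute u_X from a piecewise-constant dendrogram.

    levels: list of (t, partition) sorted by t, where partition is a list of
    disjoint sets covering X, partitions get coarser as t grows, and the last
    partition has exactly one block.
    """
    points = sorted(set().union(*levels[0][1]))
    u = {}
    for x in points:
        for y in points:
            # Smallest listed time at which x and y share a block.
            u[x, y] = min(
                t for t, blocks in levels
                if any(x in block and y in block for block in blocks)
            )
    return u
```

For example, if a and b merge at time 1 and c joins them at time 3, then \(u_X(a,b)=1\) and \(u_X(a,c)=u_X(b,c)=3\), which satisfies the ultra-triangle inequality of Definition A.4.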

Proposition A.6

Given any two dendrograms \(\theta _X,\theta _Y\) over sets X and Y, respectively, let \(u_X,u_Y\) be the canonical ultra-pseudometrics on X and Y, respectively. Then, \(d_{\textrm{I}}^\textrm{F}(\theta _X,\theta _Y)=2\ d_{\textrm{GH}}((X,u_X), (Y,u_Y)).\)

Proof

We first show that the LHS \(\ge \) the RHS. Let \(\varepsilon \ge 0\) and let \(R:\ X \xleftarrow{\ \varphi _X\ } Z \xrightarrow{\ \varphi _Y\ } Y\) be any \(\varepsilon \)-tripod between the two dendrograms \(\theta _X\) and \(\theta _Y\). Let \((x,y),(x',y')\in R\), that is, \((x,y)=(\varphi _X(z),\varphi _Y(z))\) and \((x',y')=(\varphi _X(z'),\varphi _Y(z'))\) for some \(z,z'\in Z\), and let \(t:=u_X(x,x')\). This implies that \(x,x'\) belong to the same block of the partition \(\theta _X(t).\) Since \(\theta _X(t)\le _R \bigvee _{[t]^\varepsilon }\theta _Y=\theta _Y(t+\varepsilon )\), y and \(y'\) must belong to the same block of \(\theta _Y(t+\varepsilon )\), and in turn this implies that \(u_Y(y,y')\le t+\varepsilon =u_X(x,x')+\varepsilon \). By symmetry, we also have \(u_X(x,x')\le u_Y(y,y')+\varepsilon \) and in turn \(\left|u_X(x,x')-u_Y(y,y')\right|\le \varepsilon \). Since the pairs were arbitrary, \(\textrm{dis}(R)\le \varepsilon \), and by Definition A.2 this implies that \(d_\textrm{GH}((X,u_X),(Y,u_Y))\le \varepsilon /2.\)

Next, we prove the opposite inequality. Let \(R:\ X \xleftarrow{\ \varphi _X\ } Z \xrightarrow{\ \varphi _Y\ } Y\) be a tripod between X and Y such that \(\textrm{dis}(R)=\varepsilon .\) It suffices to show that for all \(t\in \textbf{R}\), \(\theta _X(t)\le _R\theta _Y(t+\varepsilon )\) and \(\theta _Y(t)\le _{R}\theta _X(t+\varepsilon )\). By symmetry, we only prove that \(\theta _X(t)\le _R\theta _Y(t+\varepsilon )\) for all \(t\in \textbf{R}\). For \(t<0\), since \(\theta _X(t)=\emptyset \), \(\theta _X(t)\le _R\theta _Y(t+\varepsilon )\) trivially holds. Now pick any \(t\ge 0\) and pick any \((x,y),(x',y')\in R\). Assume that \(x,x'\) belong to the same block of \(\theta _X(t),\) implying that \(u_X(x,x')\le t.\) Since \(\left|u_X(x,x')-u_Y(y,y')\right|\le \varepsilon \), we know \(u_Y(y,y')\le t+\varepsilon ,\) and hence \(y,y'\) belong to the same block of \(\theta _Y(t+\varepsilon )\). Therefore, \(\theta _X(t)\le _R\theta _Y(t+\varepsilon )\) for all \(t\in \textbf{R}\). \(\square \)

Theorem A.7

(Complexity of computing \(d_{\textrm{I}}^\textrm{F}\)) Fix \(\rho \in (1,6)\). It is not possible to obtain a \(\rho \)-approximation to the distance \(d_{\textrm{I}}^\textrm{F}(\theta _X,\theta _Y)\) between formigrams in time polynomially depending on \(\left|X\right|,\left|Y\right|,\left|\textrm{crit}(\theta _X)\right|\), \(\left|\textrm{crit}(\theta _Y)\right|\) unless \(P=NP\).

Proof

Pick any two dendrograms and invoke Proposition A.6 to reduce the problem to the computation of the Gromov-Hausdorff distance between the ultra-pseudometric spaces associated to the dendrograms. The rest of the proof follows along the same lines as that of Theorem 4.7. \(\square \)

A.4 Proof of Theorem 6.30

Theorem 6.30 directly follows from Theorem A.9 below.

We explicitly represent the colimit of \(M:\textbf{ZZ}\rightarrow \textbf{set}\) as follows. For \(k,l\in \textbf{ZZ}\), assume that \(x\in M{(k)}\) and \(y\in M{(l)}\). We write \(x\sim y\) if k and l are comparable and one of x and y is mapped to the other via the internal map between M(k) and M(l). The colimit of M is the pair \(\left( C,(i_k)_{k\in \textbf{ZZ}}\right) \) described as follows:

$$\begin{aligned} C:=\left( \coprod _{k\in \textbf{ZZ}}M(k)\right) \big /\approx , \end{aligned} \tag{A2}$$

where \(\approx \) is the equivalence relation generated by the relations \(x_k\sim x_{l}\) for \(x_k\in M(k)\) and \(x_{l}\in M(l)\) with k, l being comparable. For the quotient map \(q:\coprod _{k\in \textbf{ZZ}}M(k)\rightarrow C\), each \(i_k\) is the composition \(M(k)\hookrightarrow \coprod _{k\in \textbf{ZZ}}M(k) {\mathop {\rightarrow }\limits ^{q}} C\).

Let \(I\in \textbf{Int}(\textbf{ZZ})\). For any functor \(N:I\rightarrow \textbf{set}\), we can construct the limit and colimit of N in the same way; namely, in the above description, replace M and \(\textbf{ZZ}\) by N and I, respectively. In what follows, we use this explicit construction whenever considering colimits of (interval restrictions of) \(\textbf{ZZ}\)-indexed \(\textbf{set}\)-diagrams.

Definition A.8

Let \(I\in \textbf{Int}(\textbf{ZZ})\) and let \(N:I\rightarrow \textbf{set}\) be any functor. Let \(c\in \varinjlim N\). We define the support of c as

$$\begin{aligned}\textrm{supp}(c):=\{k\in I: \exists x_k\in N_k, \ i_k(x_k)=c\}.\end{aligned}$$

In particular, if \(\textrm{supp}(c)=I\), we call c a full component of the functor N.

Given \(M:\textbf{ZZ}\rightarrow \textbf{set}\) and \(I\in \textbf{Int}(\textbf{ZZ})\), we denote the number of full components of \(M\vert _I\) by \(\textrm{full}(M\vert _I)\). Recall Notation 6.4.

Theorem A.9

([57, Cor. 4.10]) For any functor \(M:\textbf{ZZ}\rightarrow \textbf{set}\), the multiplicity of I in \(\textrm{barc}(\mathcal {F}\circ M)\) is

$$\begin{aligned}\textrm{full}(M\vert _I)-\textrm{full}(M\vert _{I^+})-\textrm{full}(M\vert _{I^-})+\textrm{full}(M\vert _{I^\pm }).\end{aligned}$$
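The colimit construction (A2), the supports of Definition A.8, and the inclusion-exclusion formula of Theorem A.9 can be tried out on a toy example. The Python sketch below makes several simplifying assumptions that are not taken from the paper: the diagram is a finite zigzag indexed by 0, ..., n-1 rather than by \(\textbf{ZZ}\); each arrow joins two adjacent indices and is stored as a triple (source, target, map); an interval is a pair (a, b) of indices; and \(I^+\), \(I^-\), \(I^\pm \) are taken to be the interval extended by one index to the right, to the left, and on both sides (Notation 6.4 is not reproduced here, so this reading is an assumption). Intervals that run past either end of the index range are given zero full components.

```python
def supports(sets, arrows, a, b):
    """Supports of the colimit elements of the diagram restricted to [a, b]."""
    parent = {(k, x): (k, x) for k in range(a, b + 1) for x in sets[k]}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    # Glue every element to its image under each arrow inside [a, b].
    for src, dst, f in arrows:
        if a <= src <= b and a <= dst <= b:
            for x in sets[src]:
                parent[find((src, x))] = find((dst, f[x]))

    result = {}
    for v in parent:
        result.setdefault(find(v), set()).add(v[0])
    return list(result.values())

def full(sets, arrows, a, b):
    """Number of full components on [a, b]; 0 if [a, b] leaves the index range."""
    if a < 0 or b >= len(sets) or a > b:
        return 0
    whole = set(range(a, b + 1))
    return sum(1 for s in supports(sets, arrows, a, b) if s == whole)

def multiplicity(sets, arrows, a, b):
    """Inclusion-exclusion count modeled on the formula of Theorem A.9."""
    return (full(sets, arrows, a, b) - full(sets, arrows, a, b + 1)
            - full(sets, arrows, a - 1, b) + full(sets, arrows, a - 1, b + 1))

def barcode(sets, arrows):
    """All intervals with nonzero multiplicity, as a dict {(a, b): multiplicity}."""
    n = len(sets)
    bars = {(a, b): multiplicity(sets, arrows, a, b)
            for a in range(n) for b in range(a, n)}
    return {interval: m for interval, m in bars.items() if m != 0}
```

As a usage example, take the zigzag {a, b} ← {c} → {d, e} in which c is sent to a and to d. The colimit has three elements with supports {0, 1, 2}, {0}, and {2}, so there is one full component on the whole index range, and the resulting barcode consists of the intervals (0, 2), (0, 0), and (2, 2), each with multiplicity one.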

Proof of Theorem 6.30

By Proposition 6.10, for every \(I\in \textbf{Int}(\textbf{ZZ})\),

Therefore, by Theorem A.9, it suffices to show that for all \(J\in \textbf{Int}(\textbf{ZZ})\). If \(J\in \textbf{Int}(\textbf{ZZ})\) is not a subset of \(\textrm{supp}(\theta _X)\), then clearly . Now assume that \(J\in \textbf{Int}(\textbf{ZZ})\) is contained in \(\textrm{supp}(\theta _X)\). Then, since \(\theta _X\) is saturated, . Also, \(\left|\bigvee _J\theta _X\right|\) is equal to the number of full components of \(\textrm{Reeb}(\theta _X)\vert _J\), completing the proof. \(\square \)

A.5 From unlabeled formigrams to persistent cluster counting functors

Let \(\theta _X\) be a formigram. We shall prove that the persistent cluster counting functor (cf. Definition 6.33) can be obtained from the unlabeled formigram of \(\theta _X\) (cf. Definition 3.12 (i)).

Proposition A.10

Let \(\theta \) be the unlabeled formigram of \(\theta _X\). For any \(I\in \textbf{Int}(\textbf{ZZ})\), consider the canonical limit-to-colimit morphism \(\varphi _I:\varprojlim \theta \vert _I\rightarrow \varinjlim \theta \vert _I\) in the category \(\textbf{Part}\). Then, .

Proof

By Proposition 5.1, \(\varprojlim \theta \vert _I\cong \bigwedge _I\theta _X\) and \(\varinjlim \theta \vert _I\cong \bigvee _I\theta _X\), and the morphism \(\varphi _I\) in \(\textbf{Part}\) is the inclusion \(\bigcap _{t\in I}\bigcup \theta _X(t)\hookrightarrow \bigcup _{t\in I}\bigcup \theta _X(t)\). Now by Proposition 5.8 (ii) the desired isomorphism follows. \(\square \)

Proposition A.10 implies that we can extract from \(\theta \): namely, equals the number of blocks in the second entry of \(\textrm{coim}(\varphi _I)\). Conversely, one may wonder whether contains enough information to reconstruct \(\theta \). This is not the case: there exists a pair of formigrams that have identical persistent cluster counting functors but different underlying weighted/unweighted Reeb graphs, which implies that their unlabeled formigrams are also different.

A.6 Details from Section 7

A.6.1 Details from Section 7.2

Proof of Proposition 7.5

Clearly, \(\mathcal {R}_\delta ^1(\gamma _X)\) is a function \(\textbf{R}\rightarrow \textrm{Graph}(X)\). We show that \(\mathcal {R}_\delta ^1(\gamma _X)\) is cosheaf-inducing (cf. Definition 2.17). First we prove that locally \(\mathcal {R}_\delta ^1(\gamma _X)\) admits only finitely many points of discontinuity (those points are called critical points). Let \(I\subset \textbf{R}\) be any nonempty finite interval. For \(i,j\in X:=\{1,\ldots ,n\}\), let \(f_{i,j}:=d_X(\cdot )(i,j):\textbf{R}\rightarrow \textbf{R}_+\). Note that discontinuity points of \(\mathcal {R}_\delta ^1(\gamma _X)\) can occur only at endpoints of connected components of the set \({f_{i,j}}^{-1}(\delta )\) for some \(i,j\in X\). Fix any \(i,j\in X\). Then, by Definition 7.4, the set \({f_{i,j}}^{-1}(\delta )\cap I\) has only finitely many connected components and thus there are only finitely many endpoints arising from those components. Since the set X is finite, this implies that \(\mathcal {R}_\delta ^1(\gamma _X)\) can have only finitely many points of discontinuity in I. Fix any point \(c\in \textbf{R}\) on which \(\mathcal {R}_\delta ^1(\gamma _X)\) is discontinuous. Consider the following two subsets of \(X\times X\):

$$\begin{aligned} A(c,\delta ):=\{(i,j):\ i<j\in X,\ d_X(c)(i,j)\le \delta \}, \end{aligned}$$
$$\begin{aligned} B(c,\delta ):=\{(i,j):\ i<j\in X,\ d_X(c)(i,j)>\delta \}. \end{aligned}$$

The continuity of \(d_X(\cdot )(i,j)\) for each \((i,j)\in X\times X\) guarantees that there exists \(\varepsilon >0\) such that

$$\begin{aligned} B(t,\delta )\supset B(c,\delta )\hspace{3mm} \text{ for } \text{ all }\ t\in (c-\varepsilon ,c+\varepsilon ) \end{aligned}$$

and in turn

$$\begin{aligned} A(t,\delta )\subset A(c,\delta )\hspace{3mm} \text{ for } \text{ all }\ t\in (c-\varepsilon ,c+\varepsilon ) \end{aligned}$$

since \(A(t,\delta )\cup B(t,\delta )=\{(i,j):i<j\in X\}\) for all \(t\in \textbf{R}.\) This implies that the graph \(\mathcal {R}_\delta ^1(\gamma _X(c))\) contains \(\mathcal {R}_\delta ^1(\gamma _X(t))\) as a subgraph for each \(t\in (c-\varepsilon ,c+\varepsilon )\). \(\square \)
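The containment at the end of this proof can be seen on a toy example. The sketch below assumes a hypothetical three-point DMS, given by the function `d_at`, whose distance between points 1 and 2 touches the threshold \(\delta =1\) only at \(t=0\); the helper `rips_edges` plays the role of \(A(t,\delta )\). Both names are illustrative and not from the paper.

```python
def rips_edges(d, delta):
    """Edge set A(t, delta): pairs (i, j), i < j, at distance <= delta."""
    return {pair for pair, dist in d.items() if dist <= delta}

def d_at(t):
    """Hypothetical distances at time t (valid for small |t|)."""
    return {(1, 2): 1.0 + t * t, (1, 3): 1.0, (2, 3): 1.5}

delta = 1.0
at_critical = rips_edges(d_at(0.0), delta)    # edges at the critical point c = 0
just_before = rips_edges(d_at(-0.1), delta)   # edges slightly before c
just_after = rips_edges(d_at(0.1), delta)     # edges slightly after c
```

Here the edge (1, 2) is present only at the critical time, so the graph at \(c=0\) contains the nearby graphs as proper subgraphs.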

A.6.2 Details from Section 7.3

Details about \(d_{\textrm{I},\lambda }^\textrm{dynM}\). We investigate further properties of the metrics in the family \(\left\{ d_{\textrm{I},\lambda }^\textrm{dynM}\right\} _{\lambda \in [0,\infty )}\). In particular, a discussion about stable invariants of DMSs with respect to the metrics \(d_{\textrm{I},\lambda }^\textrm{dynM}\) for \(\lambda >0\) can be found in [58].

Remark A.11

Let \(\lambda >0\). The distance \(d_{\textrm{I},\lambda }^\textrm{dynM}\) between any two bounded DMSs is finite. More specifically, for any r-bounded DMSs \(\gamma _X=(X,d_X(\cdot ))\) and \(\gamma _Y=(Y,d_Y(\cdot ))\) for some \(r>0\), any tripod R between X and Y is a \((\lambda , \frac{r}{\lambda })\)-tripod between \(\gamma _X\) and \(\gamma _Y\). This implies that

$$\begin{aligned} d_{\textrm{I},\lambda }^\textrm{dynM}(\gamma _X,\gamma _Y)\le \frac{r}{\lambda }. \end{aligned}$$

Definition A.12

(Equivalent tripods) Let X and Y be any two sets. For any two tripods R and S between X and Y, we say that R and S are equivalent if, for all \(x\in X\) and \(y\in Y\), \((x,y)\in R\) if and only if \((x,y)\in S\).

Remark A.13

Let \(\gamma _X=(X,d_X(\cdot ))\) and \(\gamma _Y=(Y,d_Y(\cdot ))\) be any two DMSs. Suppose that R and S are equivalent tripods between X and Y (cf. Definition A.12). Then, it is not difficult to check that for any \(\lambda ,\varepsilon \ge 0\), R is a \((\lambda ,\varepsilon )\)-tripod between \(\gamma _X\) and \(\gamma _Y\) if and only if S is a \((\lambda ,\varepsilon )\)-tripod between \(\gamma _X\) and \(\gamma _Y\).

Proof of Theorem 7.12

We prove the triangle inequality. Take any DMSs \(\gamma _X,\gamma _Y\) and \(\gamma _W\) over X, Y and W, respectively. For some \(\varepsilon ,\varepsilon '>0\), let \(R_1\) and \(R_2\) be any \((\lambda ,\varepsilon )\)-tripod between \(\gamma _X\) and \(\gamma _Y\) and \((\lambda ,\varepsilon ')\)-tripod between \(\gamma _Y\) and \(\gamma _W\) (cf. Definition 7.8), respectively. Consider the set \(Z:=\left\{ (z_1,z_2)\in Z_1\times Z_2:\varphi _Y(z_1)=\psi _Y(z_2)\right\} \) and let \(\pi _1:Z\rightarrow Z_1\) and \(\pi _2:Z\rightarrow Z_2\) be the canonical projections to the first and the second coordinate, respectively. Define the tripod \(R_2\circ R_1\) between X and W as in equation (2). It is not difficult to check that \(R_2\circ R_1\) is a \((\lambda ,\varepsilon +\varepsilon ')\)-tripod between \(\gamma _X\) and \(\gamma _W\). Taking the infimum over all such \(\varepsilon \) and \(\varepsilon '\), we obtain \(d_{\textrm{I},\lambda }^\textrm{dynM}(\gamma _X,\gamma _W)\le d_{\textrm{I},\lambda }^\textrm{dynM}(\gamma _X,\gamma _Y)+ d_{\textrm{I},\lambda }^\textrm{dynM}(\gamma _Y,\gamma _W).\)

Next assume that \(d_{\textrm{I},\lambda }^\textrm{dynM}(\gamma _X,\gamma _Y)=0\). We outline the proof of the fact that \(\gamma _X\) and \(\gamma _Y\) are isomorphic (cf. Definition 7.3). Because there are only finitely many tripods between X and Y up to equivalence (cf. Definition A.12), \(d_{\textrm{I},\lambda }^\textrm{dynM}(\gamma _X,\gamma _Y)=0\) implies that there must be a certain tripod R between X and Y such that R is a \((\lambda ,\varepsilon )\)-tripod between \(\gamma _X\) and \(\gamma _Y\) for every \(\varepsilon >0\). In order to show that \(\gamma _X\) and \(\gamma _Y\) are isomorphic, one needs to prove that R is in fact a \((\lambda ,0)\)-tripod. After that, invoke Definition 7.1, (ii) and (iii) to verify that the multivalued map \(\varphi _Y\circ \varphi _X^{-1}:X\rightrightarrows Y\) is in fact a bijection from X to Y.

Lastly, by Remark A.11, for \(\lambda >0\), \(d_{\textrm{I},\lambda }^\textrm{dynM}\) is finite between bounded DMSs. \(\square \)

Remark A.14

(For \(\lambda >0\), \(d_{\textrm{I},\lambda }^\textrm{dynM}\) generalizes the Gromov-Hausdorff distance) Let \(\lambda >0 \). Given any two constant DMSs \(\gamma _X\equiv (X,d_X)\) and \(\gamma _Y\equiv (Y,d_Y)\), the value \(d_{\textrm{I},\lambda }^\textrm{dynM}(\gamma _X,\gamma _Y)\) equals the Gromov-Hausdorff distance between \((X,d_X)\) and \((Y,d_Y)\) up to the multiplicative constant \(\frac{\lambda }{2}\). Indeed, for any tripod R between X and Y, condition (15) reduces to

$$\begin{aligned} \left|d_X\left( \varphi _X(z),\varphi _X(z')\right) -d_Y\left( \varphi _Y(z),\varphi _Y(z')\right) \right|\le \lambda \varepsilon \ \ \text{ for } \text{ all } z,z'\in Z. \end{aligned}$$

Therefore,

$$\begin{aligned} d_\textrm{GH}((X,d_X),(Y,d_Y))=\frac{\lambda }{2}\cdot d_{\textrm{I},\lambda }^\textrm{dynM}(\gamma _X,\gamma _Y). \end{aligned}$$
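For very small spaces, the identity above can be checked by brute force. The following Python sketch enumerates all correspondences between two finite metric spaces (the relations induced by tripods, cf. Definition A.12), takes half the minimal distortion as the Gromov-Hausdorff distance, and then rescales by \(2/\lambda \) as in Remark A.14. The function name `gh_distance` and the two-point test spaces are illustrative assumptions; the enumeration is exponential and only meant for toy inputs.

```python
def gh_distance(dX, dY):
    """Brute-force Gromov-Hausdorff distance between two finite metric
    spaces given as distance matrices (lists of lists)."""
    n, m = len(dX), len(dY)
    pairs = [(i, j) for i in range(n) for j in range(m)]
    best = float("inf")
    for mask in range(1, 1 << len(pairs)):
        R = [p for b, p in enumerate(pairs) if (mask >> b) & 1]
        # A correspondence must project onto all of X and all of Y.
        if {i for i, _ in R} != set(range(n)):
            continue
        if {j for _, j in R} != set(range(m)):
            continue
        distortion = max(abs(dX[i][k] - dY[j][l]) for (i, j) in R for (k, l) in R)
        best = min(best, distortion)
    return best / 2

# Two-point spaces with inter-point distances 1 and 3.
dX = [[0, 1], [1, 0]]
dY = [[0, 3], [3, 0]]
gh = gh_distance(dX, dY)

lam = 4.0
d_dyn = 2 * gh / lam  # value of the lambda-slack distance for the constant DMSs
```

For these inputs the best correspondence is a bijection with distortion 2, so the Gromov-Hausdorff distance is 1 and the rescaled value for \(\lambda =4\) is 1/2.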

We have the following bilipschitz-equivalence relation between the metrics \(d_{\textrm{I},\lambda }^\textrm{dynM}\) for different \(\lambda >0\).

Proposition A.15

(Bilipschitz-equivalence) For all \(0<\lambda <\lambda '\),

$$\begin{aligned} d_{\textrm{I},\lambda '}^\textrm{dynM}\le d_{\textrm{I},\lambda }^\textrm{dynM}\le \frac{\lambda '}{\lambda }\cdot d_{\textrm{I},\lambda '}^\textrm{dynM}. \end{aligned}$$

Proof

Fix any two DMSs \(\gamma _X\) and \(\gamma _Y\) over X and Y. The inequality \(d_{\textrm{I},\lambda '}^\textrm{dynM}(\gamma _X,\gamma _Y)\le d_{\textrm{I},\lambda }^\textrm{dynM}(\gamma _X,\gamma _Y)\) follows from the observation that any \((\lambda ,\varepsilon )\)-tripod R between \(\gamma _X\) and \(\gamma _Y\) is also a \((\lambda ',\varepsilon )\)-tripod (cf. Definition 7.8). We next prove \(d_{\textrm{I},\lambda }^\textrm{dynM}(\gamma _X,\gamma _Y)\le \frac{\lambda '}{\lambda }\cdot d_{\textrm{I},\lambda '}^\textrm{dynM}(\gamma _X,\gamma _Y).\) For some \(\varepsilon \ge 0\) let R be any \((\lambda ',\varepsilon )\)-tripod between \(\gamma _X\) and \(\gamma _Y\). It suffices to show that R is also a \((\lambda ,\frac{\lambda '}{\lambda }\varepsilon )\)-tripod. Fix any \(t\in \textbf{R}.\) Then,

$$\begin{aligned} \bigvee _{[t]^{\left( \frac{\lambda '}{\lambda }\varepsilon \right) }}d_X\le \bigvee _{[t]^{\varepsilon }}d_X\le _R d_Y(t)+\lambda '\varepsilon =d_Y(t)+\lambda \left( \frac{\lambda '}{\lambda }\varepsilon \right) . \end{aligned}$$

By symmetry, we also have \(\bigvee _{[t]^{\left( \frac{\lambda '}{\lambda }\varepsilon \right) }}d_Y\le _{R}d_X(t)+\lambda \left( \frac{\lambda '}{\lambda }\varepsilon \right) ,\) as desired. \(\square \)
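As a sanity check on Proposition A.15, consider the special case of constant DMSs \(\gamma _X\equiv (X,d_X)\) and \(\gamma _Y\equiv (Y,d_Y)\). Using only Remark A.14, and writing \(\Delta :=d_\textrm{GH}((X,d_X),(Y,d_Y))\) (a shorthand introduced here for illustration), one gets for \(0<\lambda <\lambda '\):

$$\begin{aligned} d_{\textrm{I},\lambda '}^\textrm{dynM}(\gamma _X,\gamma _Y)=\frac{2}{\lambda '}\Delta \le \frac{2}{\lambda }\Delta =d_{\textrm{I},\lambda }^\textrm{dynM}(\gamma _X,\gamma _Y)=\frac{\lambda '}{\lambda }\cdot \frac{2}{\lambda '}\Delta =\frac{\lambda '}{\lambda }\cdot d_{\textrm{I},\lambda '}^\textrm{dynM}(\gamma _X,\gamma _Y). \end{aligned}$$

So in this special case the right-hand inequality of Proposition A.15 holds with equality, which suggests that the constant \(\frac{\lambda '}{\lambda }\) cannot be improved in general.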

Appendix B Distance between weighted Reeb graphs

In this section we introduce a distance between weighted Reeb graphs which mediates between \(d_{\textrm{I}}^\textrm{F}\) and \(d_{\textrm{I}}^\textrm{R}\) (cf. Definition 2.15 and Table 3).

Proposition B.1

(Realization as an unlabeled formigram) For any weighted Reeb graph \(F:\textbf{Int}\rightarrow \omega \textbf{set}\), there exists an unlabeled formigram \(\theta :\textbf{Int}\rightarrow \textbf{Part}\) such that \(F\cong \mathcal {A}\circ \theta \).

The proof is rather trivial and thus we omit it. In general, a realization of a weighted Reeb graph as an unlabeled formigram is not unique; see Example 3.13. Proposition B.1 allows us to define the following dissimilarity measure on weighted Reeb graphs. From equation (3), recall how to define \(d_{\textrm{I}}^\textrm{F}\) between unlabeled formigrams. For all weighted Reeb graphs \(F,G:\textbf{Int}\rightarrow \omega \textbf{set}\), we define:

$$\begin{aligned} W(F,G):=\inf \{\varepsilon \in [0,\infty ]:\ &\text{ there } \text{ exist } \theta ,\theta ':\textbf{Int}\rightarrow \textbf{Part} \text{ s.t. } F\cong \mathcal {A}\circ \theta ,\\ &G\cong \mathcal {A}\circ \theta ', \text{ and } d_{\textrm{I}}^\textrm{F}(\theta ,\theta ')=\varepsilon \}. \end{aligned}$$
(B3)

Since a realization of a weighted Reeb graph as an unlabeled formigram is not necessarily unique, we may have to take into account multiple realizations of F and G to compute W(F, G). As a consequence, W does not necessarily satisfy the triangle inequality, and thus we consider the maximal sub-dominant metric of W [20] as a metric on weighted Reeb graphs:

Definition B.2

(Metric on weighted Reeb graphs) For any two weighted Reeb graphs \(F,G:\textbf{Int}\rightarrow \omega \textbf{set}\),

$$\begin{aligned}d_{\textrm{I}}^{\omega \textrm{R}}(F,G)\!:=\!\inf \left\{ \sum _{i=1}^{m-1}\!W(F_i,F_{i+1}): F\!=\!F_1,\ldots ,F_m\!=\!G \ \text{ is } \text{ a } \text{ sequence } \text{ in } \textbf{Int}^{\omega \textbf{set}}\!\right\} .\end{aligned}$$
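The chain infimum in Definition B.2 has the same shape as an all-pairs shortest-path problem. The sketch below assumes a finite family of weighted Reeb graphs whose pairwise values of W are already known and stored in a symmetric matrix (computing W itself is not addressed here), and restricts chains to that family. Under these assumptions the result is only an upper bound for the distance of Definition B.2, since chains in the definition may pass through arbitrary weighted Reeb graphs. The function name `chain_infimum` is hypothetical.

```python
def chain_infimum(W):
    """Floyd-Warshall on a symmetric dissimilarity matrix W.

    Returns the matrix of minimal chain sums, i.e. the largest
    triangle-inequality-satisfying matrix bounded above by W
    (restricted to the finite family indexing W).
    """
    n = len(W)
    d = [row[:] for row in W]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

# Toy values of W that violate the triangle inequality: W[0][2] > W[0][1] + W[1][2].
W = [
    [0, 1, 5],
    [1, 0, 1],
    [5, 1, 0],
]
d = chain_infimum(W)
```

In the toy matrix, the direct value 5 between items 0 and 2 is replaced by the chain through item 1, of total length 2.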

\(d_{\textrm{I}}^{\omega \textrm{R}}\) is the greatest metric on weighted Reeb graphs among those upper bounded by W. The metric \(d_{\textrm{I}}^{\omega \textrm{R}}\) mediates between \(d_{\textrm{I}}^\textrm{F}\) and \(d_{\textrm{I}}^\textrm{R}\):

Theorem B.3

For any two formigrams \(\theta _X\) and \(\theta _Y\), let \(\omega (\theta _X)\) and \(\omega (\theta _Y)\) be their weighted Reeb graphs. Then,

$$\begin{aligned} d_{\textrm{I}}^\textrm{R}(\textrm{Reeb}(\theta _X),\textrm{Reeb}(\theta _Y))\le d_{\textrm{I}}^{\omega \textrm{R}}(\omega (\theta _X),\omega (\theta _Y)) \le d_{\textrm{I}}^\textrm{F}(\theta _X,\theta _Y). \end{aligned}$$
(B4)

Proof

From the definition of \(d_{\textrm{I}}^\textrm{R}\) and Proposition 4.13, we have that

$$\begin{aligned}d_{\textrm{I}}^\textrm{R}(\textrm{Reeb}(\theta _X),\textrm{Reeb}(\theta _Y))\le W(\omega (\theta _X),\omega (\theta _Y)).\end{aligned}$$

Since \(d_{\textrm{I}}^{\omega \textrm{R}}\) is the greatest metric on weighted Reeb graphs among those upper bounded by W, the left inequality in (B4) follows. The right inequality in (B4) directly follows from the definition of \(d_{\textrm{I}}^{\omega \textrm{R}}\). \(\square \)

\(d_{\textrm{I}}^{\omega \textrm{R}}\) is strictly less discriminative than \(d_{\textrm{I}}^\textrm{F}\), while being strictly more discriminative than \(d_{\textrm{I}}^\textrm{R}\):

Example B.4

(i) Consider \(\theta _X\) and \(\theta _Y\) in Example 3.13. Since the underlying weighted Reeb graphs of \(\theta _X\) and \(\theta _Y\) are isomorphic, their distance in \(d_{\textrm{I}}^{\omega \textrm{R}}\) is zero. However, by Remark 4.12 (i), we have that \(d_{\textrm{I}}^\textrm{F}(\mathcal {U}_L^X\circ \theta _X,\mathcal {U}_L^Y\circ \theta _Y)=d_{\textrm{I}}^\textrm{F}(\theta _X,\theta _Y)>0\).

(ii) Let F and G be weighted Reeb graphs depicted in Fig. 14. Their unweighted Reeb graphs \(\mathcal {A}\circ F\) and \(\mathcal {A}\circ G\) are clearly isomorphic and thus \(d_{\textrm{I}}^\textrm{R}(\mathcal {A}\circ F,\mathcal {A}\circ G)=0\). On the other hand, \(d_{\textrm{I}}^{\omega \textrm{R}}(F,G)= 1/2;\) this follows from the observation that both F and G are uniquely realized (up to natural isomorphism) by unlabeled formigrams \(\theta \) and \(\theta '\), which leads to \(d_{\textrm{I}}^{\omega \textrm{R}}(F,G)=d_{\textrm{I}}^\textrm{F}(\theta ,\theta ')\). Also, it is not difficult to check that \(d_{\textrm{I}}^\textrm{F}(\theta ,\theta ')=1/2\).

Fig. 14

Two weighted Reeb graphs in Example B.4 (ii)

Appendix C Smoothing formigrams

The goal of this section is to establish a few basic properties of smoothing of formigrams. In particular, we reveal its effect on the zigzag barcodes of formigrams and its compatibility with smoothing of Reeb graphs in [32]; see Propositions C.2 and C.5.

Recall that a formigram \(\theta _X\) is regarded as either a cosheaf-inducing function \(\textbf{R}\rightarrow \textrm{SubPart}(X)\) or a constructible cosheaf \(\textbf{Int}\rightarrow \textrm{SubPart}(X)\) (cf. Definition 3.5, Remark 2.18 (i) and (ii)). By Definition 2.23 and Proposition 2.25, a smoothing operation on formigrams can be induced via the join operation on subpartitions. Namely, \(S_\varepsilon \theta _X\) sends each \(I\in \textbf{Int}\) to \(\bigvee _{I^\varepsilon }\theta _X:=\bigvee \{\theta _X(t):t\in I^\varepsilon \}\).
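The smoothing operation can be sketched on sampled data. The Python sketch below assumes a formigram that has been sampled at finitely many times (a simplification of the cosheaf-inducing function on \(\textbf{R}\)), takes \([t]^\varepsilon \) to be the window \([t-\varepsilon ,t+\varepsilon ]\), and computes the join of subpartitions by merging all blocks that share an element. The names `join` and `smooth` are hypothetical.

```python
def join(subpartitions):
    """Join of subpartitions of a set: the finest subpartition whose blocks
    contain every input block (blocks sharing an element get merged)."""
    blocks = []  # invariant: pairwise disjoint sets
    for P in subpartitions:
        for B in P:
            merged = set(B)
            untouched = []
            for C in blocks:
                if C & merged:
                    merged |= C
                else:
                    untouched.append(C)
            blocks = untouched + [merged]
    return {frozenset(B) for B in blocks}

def smooth(samples, eps):
    """samples: dict mapping a sample time to a subpartition (list of blocks).
    Returns, for each sample time t, the join over sample times in [t - eps, t + eps]."""
    return {
        t: join(P for s, P in samples.items() if abs(s - t) <= eps)
        for t in samples
    }

# Entities 1 and 2 start together, everything splits at time 1,
# and at time 2 only the cluster {2, 3} is present.
samples = {
    0: [{1, 2}, {3}],
    1: [{1}, {2}, {3}],
    2: [{2, 3}],
}
smoothed = smooth(samples, eps=1)
```

At time 1 the window sees both the early cluster {1, 2} and the late cluster {2, 3}, so the smoothed subpartition at that time consists of the single block {1, 2, 3}.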

Remark C.1

(Comparison with robust grouping structure) The use of the join operation is an important element that distinguishes our notion of smoothing from the robust grouping structure in [15]. In particular, given a dynamic metric space (DMS), the induced formigram of this DMS (which is obtained by combining Definition 3.10 and Proposition 7.5) can be smoothed out using the join operation. We emphasize that this smoothing operation is intrinsic in contrast to the robust grouping structure from [15]. Namely, our smoothing operation can be carried out without constructing any topological space in the spatiotemporal ambient space of the DMS as illustrated in [15, Fig. 11]. One consequence of this ‘intrinsicality’ is that, when a dynamic graph is the input data (as opposed to a dynamic metric space), we can smooth out its induced formigram (cf. Definition 3.10), while [15] does not propose such a method. Since the coordinates of entities are not always available in applications (e.g. sensor networks [31, 33], low-cost swarm robots [75], etc.), this intrinsicality is a desirable property.

Given a formigram \(\theta _X\), Fig. 15 illustrates both the relationship between \(\textrm{Reeb}(\theta _X)\) and \(\textrm{Reeb}(S_\varepsilon \theta _X)\) and the relationship between their zigzag barcodes. The following proposition precisely describes the relationship between \({\bar{c}}(\theta _X)\) and \({\bar{c}}(S_\varepsilon \theta _X)\). For any \(r\in \textbf{R}\), we define \(-\infty +r\) to be \(-\infty \).

Fig. 15

An illustration for Proposition C.2. Top: The Reeb graph of a formigram \(\theta _X\) and its barcode. Bottom: The Reeb graph of the formigram \(S_\varepsilon \theta _X\) and its barcode. Small loops in \(\textrm{Reeb}(\theta _X)\) disappear in \(\textrm{Reeb}(S_\varepsilon \theta _X)\). In the barcodes, bars with “[” on the left stand for half-closed intervals of the form [a, b). Open intervals in \({\bar{c}}(\theta _X)\) that are shorter than \(2\varepsilon \) do not have corresponding intervals in \({\bar{c}}(S_\varepsilon \theta _X)\). Also, disbanding and merging events in \(\theta _X\) which do not correspond to vertices on small loops in \(\textrm{Reeb}(\theta _X)\) are replicated in \(S_\varepsilon \theta _X\): disbanding events in \(\theta _X\) are reflected in \(S_\varepsilon \theta _X\) but with delay \(\varepsilon \), whereas merging events in \(\theta _X\) are advanced by \(\varepsilon \). For example, observe from the graphs \(\textrm{Reeb}(\theta _X)\) and \(\textrm{Reeb}(S_\varepsilon \theta _X)\) that the disbanding event in \(\theta _X\) at \(t=t_0\) is delayed to \(t=t_0+\varepsilon \) in \(S_\varepsilon \theta _X\)

Proposition C.2

Let \(\theta _X\) be a formigram over X and let \(\varepsilon \ge 0\). Then, we have the following bijection between \({\bar{c}}(\theta _X)\) and \({\bar{c}}(S_\varepsilon \theta _X)\) (cf. Fig. 15):

(C5)

Recall the free functor \(\mathcal {F}:\textbf{set}\rightarrow \textbf{vec}\) (cf. Definition 2.6) and the fact that any constructible cosheaf \(\textbf{Int}\rightarrow \textbf{vec}\) is interval decomposable (cf. Proposition 2.19).

Lemma C.3

Given any constructible cosheaf \(F:\textbf{Int}\rightarrow \textbf{set}\), the barcode of \(\mathcal {F}\circ F:\textbf{Int}\rightarrow \textbf{vec}\) cannot include any interval of the form \([b,a]_{\textrm{BL}}:=\{(x,y)\in \textbf{Int}: x\le a <b\le y\}\) for \(a<b\) in \(\textbf{R}\) (cf. Fig. 16).

Proof

Since F is constructible, it is defined via colimits over restrictions of a zigzag diagram over \(\textbf{R}\) (cf. Definition 2.14). Hence, for any \(J\in \textbf{Int}\) and for any \(x\in F(J)\), there exist \(t\in J\) and \(y\in F([t,t])\) such that \(F([t,t]\subset J)(y)=x\). This property directly implies that the interval module \(I^{[b,a]_{\textrm{BL}}}\) cannot be a summand of \(\mathcal {F}\circ F\). \(\square \)

Proof of Proposition C.2

By Definitions 2.20 and 2.23, \({\bar{c}}(S_\varepsilon \theta _X)\) is equal to the multiset

$$\begin{aligned}\left\{ B\cap \textbf{R}_{y=x+2\varepsilon }: B\in \textrm{barc}(\mathcal {F}\circ \mathcal {C}\circ \theta _X)\right\} , \end{aligned}$$

where \(\textbf{R}_{y=x+2\varepsilon }\) is the line \(y=x+2\varepsilon \) identified with the real line via the bijection \((r-\varepsilon ,r+\varepsilon )\leftrightarrow r\). The table (C5) is directly obtained by Lemma C.3 and the block decomposability of \(\mathcal {F}\circ \mathcal {C}\circ \theta _X\) [12, Sect. 3]. \(\square \)

Fig. 16

An illustration of \([b,a]_{\textbf{ZZ}}\) for \(a<b\) in \(\textbf{Z}\)

The bijective correspondence of barcodes given in Proposition C.2 directly implies the following:

Corollary C.4

Let \(\theta _X\) be any formigram over X. Then, for \(\varepsilon \ge 0\),

$$\begin{aligned} d_\textrm{B}\left( {\bar{c}}\left( S_\varepsilon \theta _X\right) ,\ {\bar{c}}\left( \theta _X\right) \right) \le \varepsilon . \end{aligned}$$

The smoothing operations defined for formigrams and Reeb graphs (cf. Definition 2.23) are compatible in the following sense:

Proposition C.5

Let \(\theta _X\) be a formigram over X. Then, for any \(\varepsilon \ge 0\),

$$\begin{aligned} \textrm{Reeb}(S_\varepsilon \theta _X)=S_\varepsilon \textrm{Reeb}(\theta _X). \end{aligned}$$

Proof

Let \(I\in \textbf{Int}\). We have:

$$\begin{aligned} \textrm{Reeb}(S_\varepsilon \theta _X)(I)&=\left( \mathcal {C}\circ (S_\varepsilon \theta _X) \right) (I)&\text{ by } \text{ Definitions }\,2.9 \hbox { and }\,5.3\\ {}&=\mathcal {C}\circ \left( S_\varepsilon \theta _X\right) (I)&\text{ by } \text{ Proposition }\,5.4\\&=\mathcal {C}\circ \theta _X (I^\varepsilon )&\text{ by } \text{ Definition }\,2.23\\ {}&= \left( \mathcal {C}\circ \theta _X \right) (I^\varepsilon )&\text{ by } \text{ Proposition }\,5.4 \\&=S_\varepsilon \textrm{Reeb}(\theta _X)(I). \end{aligned}$$

\(\square \)

Formigrams change in a continuous manner under \(\varepsilon \)-smoothing:

Proposition C.6

For any \(\varepsilon \ge 0\) and any formigram \(\theta _X\),

$$\begin{aligned} d_{\textrm{I}}^\textrm{F}\left( S_\varepsilon \theta _X,\theta _X\right) \le \varepsilon . \end{aligned}$$

Proof

Consider the tripod and check that R is an \(\varepsilon \)-tripod between \(S_\varepsilon \theta _X\) and \(\theta _X\). \(\square \)

The following proposition is analogous to [32, Prop. 4.14]:

Proposition C.7

For any \(\varepsilon \ge 0\), \(S_\varepsilon \) is a contraction on formigrams, i.e. for any formigrams \(\theta _X\) and \(\theta _Y\)

$$\begin{aligned} d_{\textrm{I}}^\textrm{F}\left( S_\varepsilon \theta _X,S_\varepsilon \theta _Y\right) \le d_{\textrm{I}}^\textrm{F}\left( \theta _X,\theta _Y\right) . \end{aligned}$$

Proof

For \(\delta \ge 0\), assume that R is a \(\delta \)-tripod between \(\theta _X\) and \(\theta _Y\). We claim that R is also a \(\delta \)-tripod between \(S_\varepsilon \theta _X\) and \(S_\varepsilon \theta _Y\). First, we remark that \(\varphi _X^*S_\varepsilon \theta _X=S_\varepsilon \varphi _X^*\theta _X\). Indeed, for any \(I\in \textbf{Int}\), \((\varphi _X^*S_\varepsilon \theta _X)(I)=\varphi _X^*(S_\varepsilon \theta _X(I))=\varphi _X^*(\theta _X(I^\varepsilon ))=(S_\varepsilon \varphi _X^*\theta _X)(I)\). Therefore,

$$\begin{aligned} S_\delta \left( \varphi _X^*S_\varepsilon \theta _X\right)&= S_\delta \left( S_\varepsilon \varphi _X^*\theta _X\right) \\ {}&= S_{\varepsilon }\left( S_\delta \varphi _X^*\theta _X\right)&\text{ by } \text{ Remark }\,2.24\\ {}&\ge S_\varepsilon \varphi _Y^*\theta _Y&\text{ by } \text{ the } \text{ choice } \text{ of } R\\ {}&=\varphi _Y^*S_\varepsilon \theta _Y \end{aligned}$$

and by symmetry we have \(S_\delta \left( \varphi _Y^*S_\varepsilon \theta _Y\right) \ge \varphi _X^*S_\varepsilon \theta _X\), completing the proof. \(\square \)

Appendix D About the 0-slack interleaving distance between DMSs

We clarify the computational complexity of \(d_\textrm{I}^{\textrm{dynM}}\) (cf. Theorem 7.15) and provide a few examples of computing \(d_\textrm{I}^{\textrm{dynM}}\).

Computational complexity of \(d_\textrm{I}^{\textrm{dynM}}\).

We relate the Gromov-Hausdorff distance between two given ultrametric spaces to the interleaving distance \(d_\textrm{I}^{\textrm{dynM}}\) between certain DMSs induced by those ultrametric spaces. Then, invoking results from F. Schmiedl’s PhD thesis [76, 77] we obtain the claim of Theorem 7.15.

Given an ultrametric space \((X,u_X)\), define a DMS \({\mathcal {D}}(X,u_X):=(X,d_X(\cdot ))\) where for all \(x,x'\in X\) and for all \(t\in \textbf{R}\), \( d_X(t)(x, x'):=\max (0,u_X(x,x')-t)\). It is noteworthy that for any \(x,x'\in X\), \(d_X(\cdot )(x,x'):\textbf{R}\rightarrow \textbf{R}_+\) is decreasing down to zero and that \(d_X(0)=u_X\), a legitimate metric (i.e. not just pseudo-metric), satisfying the second item of Definition 7.1. Furthermore, note that \({\mathcal {D}}(X,u_X)\) is clearly piecewise linear and that the set of breakpoints is \(S_{{\mathcal {D}}(X,u_X)} = \{u_X(x,x'),\,x,x'\in X\}.\) Recall Definition A.2.

Proposition D.1

For any two ultrametric spaces \((X,u_X)\) and \((Y,u_Y)\) we have

$$\begin{aligned} d_\textrm{I}^{\textrm{dynM}}({\mathcal {D}}(X,u_X),{\mathcal {D}}(Y,u_Y))=2\ d_{\textrm{GH}}((X,u_X),(Y,u_Y)). \end{aligned}$$

Proof

Let \({\mathcal {D}}(X,u_X)= (X,d_X(\cdot ))\) and \({\mathcal {D}}(Y,u_Y)=(Y,d_Y(\cdot ))\). Observe that for any \(x,x'\in X\), any \(t\in \textbf{R}\), and any \(\varepsilon \ge 0\), \(\min _{s\in [t]^\varepsilon }d_X(s)(x,x') = d_X(t+\varepsilon )(x,x')\) since \(d_X\) is decreasing over time. Thus, for some \(\varepsilon \ge 0\), a tripod R is an \(\varepsilon \)-tripod between \((X,d_X), (Y,d_Y)\) (cf. Definition A.2) if and only if for all \(z,z'\in Z\) and for all \(t\in \textbf{R}\), \(d_X(t+\varepsilon )\left( \varphi _X(z),\varphi _X(z')\right) \le d_Y(t)\left( \varphi _Y(z),\varphi _Y(z')\right) \) and \(d_Y(t+\varepsilon )\left( \varphi _Y(z),\varphi _Y(z')\right) \le d_X(t)\left( \varphi _X(z),\varphi _X(z')\right) \), if and only if for all \(z,z'\in Z\) and for all \(t\in \textbf{R}\), \(\max \left( 0,u_X\left( \varphi _X(z),\varphi _X(z')\right) -t-\varepsilon \right) \le \max \left( 0,u_Y\left( \varphi _Y(z),\varphi _Y(z')\right) -t\right) \) and \(\max \left( 0,u_Y\left( \varphi _Y(z),\varphi _Y(z')\right) -t-\varepsilon \right) \le \max \left( 0,u_X\left( \varphi _X(z),\varphi _X(z')\right) -t\right) \), if and only if for all \(z,z'\in Z\),

$$\begin{aligned}\left|u_X\left( \varphi _X(z),\varphi _X(z')\right) -u_Y\left( \varphi _Y(z),\varphi _Y(z')\right) \right|\le \varepsilon ,\end{aligned}$$

completing the proof. \(\square \)
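The last equivalence in this proof can be checked on a single pair of values. The sketch below assumes two fixed ultrametric values \(a=u_X(x,x')\) and \(b=u_Y(y,y')\), tests the two shifted inequalities on a finite grid of times (the grid is an assumption; the proof quantifies over all of \(\textbf{R}\)), and uses exact rational arithmetic to avoid rounding issues. The names `d` and `eps_condition` are hypothetical.

```python
from fractions import Fraction

def d(u, t):
    """Distance at time t induced by the ultrametric value u: max(0, u - t)."""
    return max(Fraction(0), u - t)

def eps_condition(a, b, eps, ts):
    """Check d_X(t + eps) <= d_Y(t) and d_Y(t + eps) <= d_X(t) on the grid ts."""
    return all(
        d(a, t + eps) <= d(b, t) and d(b, t + eps) <= d(a, t)
        for t in ts
    )

# Grid of times from -10 to 10 in steps of 1/4.
ts = [Fraction(k, 4) for k in range(-40, 41)]
a, b = Fraction(3), Fraction(1)

holds_at_2 = eps_condition(a, b, Fraction(2), ts)         # eps = |a - b|
holds_at_3_2 = eps_condition(a, b, Fraction(3, 2), ts)    # eps < |a - b|
```

With \(a=3\) and \(b=1\), the condition holds for \(\varepsilon =2=|a-b|\) and fails for \(\varepsilon =3/2\) (for instance at \(t=1\)), in line with the displayed inequality.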

Proof of Theorem 7.15

Pick any two ultrametric spaces \((X,u_X)\) and \((Y,u_Y)\). Then, by Proposition D.1, the interleaving distance between \({\mathcal {D}}(X,u_X)\) and \({\mathcal {D}}(Y,u_Y)\) is identical to twice the Gromov-Hausdorff distance \(\Delta :=d_\textrm{GH}((X,u_X),(Y,u_Y))\) between \((X,u_X)\) and \((Y,u_Y)\). The rest of the proof follows along the same lines as that of Theorem 4.7. \(\square \)

Next we discuss a few computational examples of \(d_\textrm{I}^{\textrm{dynM}}\). Let \(\psi :\textbf{R}\rightarrow \textbf{R}_+\) be any non identically zero continuous function. Then, for any finite metric space \((X,d_X')\) we have the DMS \(\gamma _X^\psi = (X,d_X^\psi (\cdot ))\) where for \(t\in \textbf{R}\), \(d_X^\psi (t):=\psi (t)\cdot d_X'.\)

Fig. 17

The interleaving condition. The thick blue curve and the thick red curve represent the graphs of \(\psi _0(t)=1+\cos (t)\) and \(\psi _1(t)=1+\cos (t+\pi /4),\) respectively. Fixing \(\varepsilon \ge 0\), define a function \(S_\varepsilon (\psi _0):\textbf{R}\rightarrow \textbf{R}\) by \(S_\varepsilon (\psi _0)(t):=\min _{s\in [t]^\varepsilon }\psi _0(s)\). The thin curves below the thick blue curve illustrate the graphs of \(S_\varepsilon (\psi _0)\) for several different choices of \(\varepsilon \). Note that for \(\varepsilon \ge \pi /4\simeq 0.785\), it holds that \(S_\varepsilon (\psi _0)\le \psi _1\)

Example D.2

(An interleaved pair of DMSs I) This example refers to Fig. 17. Fix the two-point metric space and consider two DMSs \(\gamma _X^{\psi _0}=(X,d_X^{\psi _0})\) and \(\gamma _X^{\psi _1}=(X,d_X^{\psi _1})\) where, for \(t\in \textbf{R}\), \(\psi _0(t)=1+\cos (t)\), \(\psi _1(t)=1+\cos (t+\pi /4)\). Then, \(\gamma _X^{\psi _0}\) and \(\gamma _X^{\psi _1}\) are \(\varepsilon \)-interleaved if and only if for \(i,j\in \{0,1\}\), \(i\ne j\), and for all \(t\in \textbf{R}\), \(S_\varepsilon (\psi _i)(t):=\min _{s\in [t]^\varepsilon }\psi _i(s)=\left( \bigvee _{[t]^\varepsilon }d_X^{\psi _i}\right) (x,x')\le d_X^{\psi _j}(t)(x,x')=\psi _j(t)\). In fact, this inequality holds if and only if \(\varepsilon \ge \pi /4\), and hence \(d_\textrm{I}^{\textrm{dynM}}\left( \gamma _X^{\psi _0},\gamma _X^{\psi _1}\right) =\pi /4\) (cf. Fig. 17).
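The threshold \(\pi /4\) in Example D.2 can be checked numerically. The sketch below approximates \(S_\varepsilon (\psi )(t)=\min _{s\in [t]^\varepsilon }\psi (s)\) by sampling the window \([t-\varepsilon ,t+\varepsilon ]\), and tests both inequalities on a grid of times covering one period. The grid sizes and the tolerance `tol` are assumptions of this sketch, so it is a sanity check rather than a proof; the names `window_min` and `interleaved` are hypothetical.

```python
import math

def psi0(t):
    return 1 + math.cos(t)

def psi1(t):
    return 1 + math.cos(t + math.pi / 4)

def window_min(psi, t, eps, n=200):
    """Approximate min of psi over [t - eps, t + eps] using n + 1 samples."""
    return min(psi(t - eps + 2 * eps * k / n) for k in range(n + 1))

def interleaved(f, g, eps, tol=1e-3, m=400):
    """Check S_eps(f) <= g and S_eps(g) <= f on m sample times in [0, 2*pi)."""
    ts = [2 * math.pi * k / m for k in range(m)]
    return all(
        window_min(f, t, eps) <= g(t) + tol and window_min(g, t, eps) <= f(t) + tol
        for t in ts
    )

at_threshold = interleaved(psi0, psi1, math.pi / 4)
below_threshold = interleaved(psi0, psi1, math.pi / 4 - 0.1)
```

At \(\varepsilon =\pi /4\) both inequalities hold up to the tolerance, while at \(\varepsilon =\pi /4-0.1\) the first one fails by roughly 0.1 near \(t=\pi /4\).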

The following example generalizes the previous one.

Example D.3

(An interleaved pair of DMSs II) Fix the two-point metric space and consider two DMSs \(\gamma _X^{\psi _0}=(X,d_X^{\psi _0})\) and \(\gamma _X^{\psi _1}=(X,d_X^{\psi _1})\) where, for \(t\in \textbf{R}\), \(\psi _0(t)=1+\cos (\omega t)\), \(\psi _1(t)=1+\cos (\omega (t+\tau ))\), for fixed \(\omega >0\) and \(0<\tau <\frac{2\pi }{\omega }\). Since in this case \(\psi _1(t) = \psi _0(t+\tau )\) for all t, one would expect that the interleaving distance between \(\gamma _X^{\psi _0}\) and \(\gamma _X^{\psi _1}\) is able to uncover the precise value of \(\tau \). In this respect, we have: \( d_\textrm{I}^{\textrm{dynM}}(\gamma _X^{\psi _0},\gamma _X^{\psi _1})=\min \Big (\tau ,\ \frac{2\pi }{\omega }-\tau \Big )=:\eta (\omega ,\tau ). \)
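Two quick evaluations of \(\eta \) illustrate the formula; the parameter values are chosen here only for illustration. Taking \(\omega =1\) and \(\tau =\pi /4\) recovers the value of Example D.2, while a shift larger than half a period is reported as the complementary shift:

$$\begin{aligned} \eta \left( 1,\tfrac{\pi }{4}\right) =\min \left( \tfrac{\pi }{4},\ 2\pi -\tfrac{\pi }{4}\right) =\tfrac{\pi }{4},\qquad \eta \left( 1,\tfrac{3\pi }{2}\right) =\min \left( \tfrac{3\pi }{2},\ 2\pi -\tfrac{3\pi }{2}\right) =\tfrac{\pi }{2}. \end{aligned}$$

In particular, \(\eta (\omega ,\tau )\) equals \(\tau \) exactly when \(\tau \le \frac{\pi }{\omega }\), that is, when the shift is at most half a period.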

Appendix E Higher dimensional persistent homology barcodes of dynamic metric spaces

In this section we discuss extendibility of Theorem 7.14. The zigzag barcodes \({\bar{c}}(\theta _X)\) and \({\bar{c}}(\theta _Y)\) in Theorem 7.14 encode the clustering behaviors of the given DMSs for a fixed scale \(\delta \ge 0\).

However, we do not need to restrict ourselves to clustering features of DMSs. Imagine that a flock of birds flies while keeping a circular arrangement from time \(t=0\) to \(t=1\). Regarding this flock as a DMS (trajectory data in \(\textbf{R}^3\)), we may want to have an interval containing [0, 1] in its 1-dimensional homology barcode. This idea can actually be implemented as follows.

For a fixed \(\delta \ge 0\), we substitute the Rips complex functor \(\mathcal {R}_\delta \) for the Rips graph functor \(\mathcal {R}_\delta ^1\) in Proposition 7.13. What we obtain is a dynamic simplicial complex or zigzag simplicial filtration, a generalization of Definition 3.1, induced from any tame DMS \(\gamma _X\). We then can apply the k-th homology functor to this zigzag simplicial filtration for each \(k\ge 0\) in order to obtain a \(\textbf{vec}\)-valued constructible cosheaf over \(\textbf{R}\). This zigzag module will be a signature summarizing the time evolution of k-dimensional homological features of \(\gamma _X\). By virtue of Proposition 2.19 we eventually obtain the k-th homology barcode \({\bar{c}}\left( \textrm{H}_k\left( \mathcal {R}_\delta (\gamma _X)\right) \right) \) of \(\gamma _X\) with respect to the fixed scale \(\delta \ge 0\); see also [35] for the computation of \({\bar{c}}\left( \textrm{H}_k\left( \mathcal {R}_\delta (\gamma _X)\right) \right) \) for various \(\delta \). In particular, the 0-th homology barcode of the resulting zigzag module coincides with \({\bar{c}}\left( \pi _0\left( \mathcal {R}_\delta ^1(\gamma _X)\right) \right) \) as defined in Theorem 7.14.

A natural question is then to ask whether our stability theorem (cf. Theorem 7.14) can be extended to higher dimensional homology barcodes:

Question E.1

For any pair of tame DMSs \(\gamma _X=(X,d_X(\cdot ))\) and \(\gamma _Y=(Y,d_Y(\cdot ))\), is it true that for any \(\delta \ge 0\) and for any \(k\ge 1\),

$$\begin{aligned} d_\textrm{B}\left( {\bar{c}}\left( \textrm{H}_k\left( \mathcal {R}_\delta (\gamma _X)\right) \right) , {\bar{c}}\left( \textrm{H}_k\left( \mathcal {R}_\delta (\gamma _Y)\right) \right) \right) \le 2\ d_\textrm{I}^{\textrm{dynM}}\left( \gamma _X,\gamma _Y\right) \ ? \end{aligned}$$

Interestingly, we found a family of counter-examples that indicates that stability, as expressed by Theorem 7.14, is a phenomenon which seems to be essentially tied to clustering (i.e. \(\textrm{H}_0\)) information.

Theorem E.2

For each integer \(k\ge 1\) there exist two different tame DMSs \(\gamma _{X_k}\) and \(\gamma _{Y_k}\), and \(\delta _k\ge 0\) such that \( d_\textrm{I}^{\textrm{dynM}}\left( \gamma _{X_k},\gamma _{Y_k}\right) <\infty \) but such that the bottleneck distance between the barcodes of \(\textrm{H}_k\left( {\mathcal {R}}_{\delta _k}\left( \gamma _{X_k}\right) \right) \) and \(\textrm{H}_k\left( {\mathcal {R}}_{\delta _k}\left( \gamma _{Y_k}\right) \right) \) is unbounded.

Fig. 18

Pairs of DMSs \((\gamma _{X_i},\gamma _{X_i}')\) for \(i=1,2\) such that \(d_\textrm{I}^{\textrm{dynM}}\left( \gamma _{X_i},\gamma _{X_i}'\right) \le \pi /2\). In contrast, for \(k=1\) (or \(k=2\)), the bottleneck distance between their k-dimensional zigzag-persistence barcodes is infinite for \(\delta \in [\sqrt{2},2)\). DMS \(\gamma _{X_1}\), described as the left-most figure, (\(\gamma _{X_2}\), the third figure from the left) consists of four (eight) static points located at \(\pm e_1=(\pm 1,0,0)\) and \(\pm e_2=(0,\pm 1,0)\) (and \(\pm e_3=(0,0,\pm 1)\)), respectively. On the other hand, DMS \(\gamma _{X_1}'\), illustrated at the second from the left (\(\gamma _{X_2}'\), at the right-most), contains a single oscillating point, denoted by a star shape, with trace \((1+\sin ^2(t))e_1\) for \(t\in \textbf{R}\) along with three (five) static points located at \(-e_1, +e_2\) and \(-e_2\), (and \(\pm e_3\)), respectively. Then, the 1-dimensional (2-dimensional) zigzag-persistent homology barcode for \(\gamma _{X_1}\) (for \(\gamma _{X_2}\)) consists of exactly one interval \((-\infty ,\infty )\), indicating the presence of a loop (a void) for all time. However, the barcode of \(\gamma _{X_1}'\) (\(\gamma _{X_2}'\)) consists of an infinite number of ephemeral intervals \([n\pi ,n\pi ]\), \(n\in \textbf{Z}\), indicating the on-and-off presence of a loop (a void) that exists only at \(t=n\pi \) for \(n\in \textbf{Z}\) in its configuration

Proof

Fix any \(k\ge 1\). We describe the DMSs \(\gamma _{X_k}\) and \(\gamma _{Y_k}\) as collections of trajectories of points in \(\textbf{R}^{k+1}\), equipped at every time \(t\in \textbf{R}\) with the metric inherited from the Euclidean metric of \(\textbf{R}^{k+1}\). For \(k=1\) or \(k=2\), see Fig. 18.

Define \(\gamma _{X_k}\) to be the constant DMS consisting of the \(2(k+1)\) points \(\pm e_i=(0,\ldots ,0,\pm 1,0,\ldots ,0)\in \textbf{R}^{k+1}\) for \(i=1,2,\ldots ,k+1\). Define \(\gamma _{Y_k}\) to be the DMS obtained from \(\gamma _{X_k}\) by replacing the static point \(+e_1\) of \(\gamma _{X_k}\) with the oscillating point \((1+\sin ^2(t))e_1=(1+\sin ^2(t),0,\ldots ,0)\) for \(t\in \textbf{R}\).
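As a quick numerical illustration of this construction, the following minimal Python sketch builds both point configurations at a given time t and measures how far the oscillating point is from its nearest neighbor. The helper names `static_points`, `oscillating_points` and `nearest_neighbor_distance` are hypothetical and introduced here only for illustration; they are not notation from the paper.

```python
import math

def static_points(k):
    """Points of gamma_{X_k}: +e_i and -e_i in R^{k+1}, constant in time."""
    pts = []
    for i in range(k + 1):
        for sign in (1.0, -1.0):
            p = [0.0] * (k + 1)
            p[i] = sign
            pts.append(tuple(p))
    return pts  # index 0 is +e_1, index 1 is -e_1

def oscillating_points(k, t):
    """Points of gamma_{Y_k} at time t: +e_1 replaced by (1 + sin^2 t) e_1."""
    pts = static_points(k)
    moving = [0.0] * (k + 1)
    moving[0] = 1.0 + math.sin(t) ** 2
    pts[0] = tuple(moving)
    return pts

def nearest_neighbor_distance(k, t):
    """Distance from the oscillating point to the closest other point at time t."""
    pts = oscillating_points(k, t)
    return min(math.dist(pts[0], q) for q in pts[1:])

# Print the nearest-neighbor distance of the oscillating point at a few times.
for t in (0.0, math.pi / 2, math.pi):
    print(t, nearest_neighbor_distance(1, t))
```

Under these assumptions, the nearest-neighbor distance equals \(\sqrt{(1+\sin ^2(t))^2+1}\), which attains its minimum value \(\sqrt{2}\) only at integer multiples of \(\pi \).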

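It is not difficult to check that \(d_\textrm{I}^{\textrm{dynM}}\left( \gamma _{X_k},\gamma _{Y_k}\right) \le \pi /2\). However, with the connectivity parameter \(\delta =\sqrt{2}\), the barcodes of their k-th zigzag persistent homology are \({\bar{c}}\left( \textrm{H}_k\left( \mathcal {R}_\delta (\gamma _{X_k})\right) \right) =\{(-\infty ,\infty )\}\) and \({\bar{c}}\left( \textrm{H}_k\left( \mathcal {R}_\delta (\gamma _{Y_k})\right) \right) =\{[n\pi ,n\pi ]:n\in \textbf{Z}\}\), respectively. Indeed, among the points \(\pm e_i\), any two non-antipodal points are at distance \(\sqrt{2}\) while antipodal points are at distance 2, so \(\mathcal {R}_\delta (\gamma _{X_k})\) is at all times the boundary of the \((k+1)\)-dimensional cross-polytope, which is a k-sphere. On the other hand, the oscillating point of \(\gamma _{Y_k}\) is at distance \(\sqrt{(1+\sin ^2(t))^2+1}\) from each \(\pm e_j\) with \(j\ge 2\); this equals \(\sqrt{2}\) exactly when \(t\in \pi \textbf{Z}\), and at all other times the oscillating point is isolated and the k-th homology vanishes. Since the unbounded interval \((-\infty ,\infty )\) cannot be matched to any bounded interval at finite cost, \(d_\textrm{B}\left( {\bar{c}}\left( \textrm{H}_k\left( \mathcal {R}_\delta (\gamma _{X_k})\right) \right) , {\bar{c}}\left( \textrm{H}_k\left( \mathcal {R}_\delta (\gamma _{Y_k})\right) \right) \right) =+\infty .\) \(\square \)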
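The pointwise homology claimed in the proof can be checked numerically. The sketch below is an illustration only and not the authors' method: it builds the Rips complex at scale \(\delta =\sqrt{2}\) at a single time t and computes the k-th Betti number with coefficients in the two-element field, using a small hand-written rank routine. All function names (`points`, `rips_simplices`, `rank_gf2`, `boundary_rank`, `betti`) and the tolerance `tol` are assumptions made for this example. It computes homology at individual times and does not compute the zigzag barcode itself.

```python
import itertools
import math

def points(k, t, oscillating):
    """Configuration at time t: +/- e_i in R^{k+1}; optionally +e_1 oscillates."""
    pts = []
    for i in range(k + 1):
        for sign in (1.0, -1.0):
            p = [0.0] * (k + 1)
            p[i] = sign
            pts.append(p)
    if oscillating:
        pts[0][0] = 1.0 + math.sin(t) ** 2  # index 0 holds +e_1
    return pts

def rips_simplices(pts, delta, dim, tol=1e-9):
    """Simplices of dimension `dim` of the Rips complex at scale delta."""
    n = len(pts)
    close = [[math.dist(pts[i], pts[j]) <= delta + tol for j in range(n)]
             for i in range(n)]
    return [s for s in itertools.combinations(range(n), dim + 1)
            if all(close[a][b] for a, b in itertools.combinations(s, 2))]

def rank_gf2(rows):
    """Rank over GF(2) of a matrix whose rows are encoded as integer bitmasks."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot  # lowest set bit of the pivot row
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank

def boundary_rank(pts, delta, d):
    """Rank over GF(2) of the boundary map from d-simplices to (d-1)-simplices."""
    if d == 0:
        return 0
    faces = {s: i for i, s in enumerate(rips_simplices(pts, delta, d - 1))}
    rows = []
    for s in rips_simplices(pts, delta, d):
        mask = 0
        for f in itertools.combinations(s, d):  # the codimension-1 faces of s
            mask |= 1 << faces[f]
        rows.append(mask)
    return rank_gf2(rows)

def betti(pts, delta, d):
    """d-th Betti number over GF(2): dim ker(boundary_d) - rank(boundary_{d+1})."""
    n_d = len(rips_simplices(pts, delta, d))
    return n_d - boundary_rank(pts, delta, d) - boundary_rank(pts, delta, d + 1)

delta = math.sqrt(2)
for k in (1, 2):
    print(k, betti(points(k, 0.0, False), delta, k),         # static configuration
             betti(points(k, 0.0, True), delta, k),          # oscillating, t = 0
             betti(points(k, math.pi / 2, True), delta, k))  # oscillating, t = pi/2
```

For \(k=1,2\) the static configuration has k-th Betti number 1, while the oscillating configuration has k-th Betti number 1 at \(t=0\) and 0 at \(t=\pi /2\), in agreement with the barcodes stated in the proof.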

Cite this article

Kim, W., Mémoli, F. Extracting Persistent Clusters in Dynamic Data via Möbius Inversion. Discrete Comput Geom 71, 1276–1342 (2024). https://doi.org/10.1007/s00454-023-00590-1
