Steady and ranging sets in graph persistence

Bergomi, Mattia G.; Ferri, Massimo; Tavaglione, Antonella

doi:10.1007/s41468-022-00099-1

Open access
Published: 23 September 2022

Volume 7, pages 33–56, (2023)

Journal of Applied and Computational Topology

Abstract

Topological data analysis can provide insight on the structure of weighted graphs and digraphs. However, some properties underlying a given (di)graph are hardly mappable to simplicial complexes. We introduce steady and ranging sets: two standardized ways of producing persistence diagrams directly from graph-theoretical features. The two constructions are framed in the context of indexing-aware persistence functions. Furthermore, we introduce a sufficient condition for stability. Finally, we apply the steady- and ranging-based persistence constructions to toy examples and real-world applications.

1 Introduction

Weighted graphs are a common data structure in many real-world scenarios. Recently, persistent homology has become a widespread tool for data analysis, classification, comparison, and retrieval. However, this technique is by its very nature limited to the analysis of weighted simplicial complexes. Although a graph is a one-dimensional complex, relevant information is not always carried by its topology, but, for instance, by graph-theoretical structures. A common way to overcome this issue is to associate auxiliary simplicial complexes with the graph; see, for instance, Bergomi et al. (2020). This strategy has been applied successfully in many settings, e.g. (Petri et al. 2014; Lord et al. 2016; Reimann et al. 2017; Rieck et al. 2018; Sizemore et al. 2018; Chowdhury and Mémoli 2018; Port et al. 2018; Blevins and Bassett 2020; Anand et al. 2020).

It is possible to define and compute persistence in other categories than simplicial complexes or topological spaces (Bergomi and Vertechi 2020; Bergomi et al. 2021) and, in a different sense, Patel (2018), McCleary and Patel (2020), Kim and Mémoli (2021), and McCleary and Patel (2022). We introduce a further class of indexing-aware persistence functions (ip-functions), defined on $(\mathbb {R}, \le )$-indexed diagrams in a given category, that can be described via persistence diagrams. Additionally, we display a specific way of building ip-functions for filtered graphs and digraphs, introducing the concepts of steady and ranging sets.

We are rather far from the categorifications of Bubenik and Scott (2014), Lesnick (2015), Oudot (2015), de Silva et al. (2018): we aim to provide a simple and agile tool that can be applied directly to graphs (i.e., without mapping graphs to simplicial complexes), and possibly to other structures arising naturally from applications. The constructions derived from the framework we propose have a topological counterpart, obtainable by considering the simplicial complex associated with a poset (see Bergomi et al. (2021), Rem. 1). Here, we show how to bypass that topological construction.

Section 1.1 briefly recalls the classical notions of persistence diagram and bottleneck distance. Section 2 focuses on graphs. First, we define ip-functions and balanced ip-functions, and discuss their stability. Then, we introduce steady and ranging sets as swift generators of ip-functions based directly on graph-theoretical features. These constructions are the theoretical core of the work. Thereafter, we apply them to study persistent Eulerian sets and monotone features on some elementary graphs. Section 2.5 showcases how the steady and ranging constructions can be leveraged in hub-detection tasks. Concrete applications follow in Sect. 3: we compute steady and ranging hubs in a network of airports, the character co-occurrence networks of Les Misérables and Game of Thrones, and a set of languages. Section 4 extends to weighted digraphs the theory developed in the previous sections. Code for the applications is available as a Python package at the repository https://github.com/MGBergomi/hubpersistence.git. The Appendix contains examples showing that most ip-functions of the paper are not balanced.

1.1 Persistence diagrams

The main objects of study in persistent homology (Edelsbrunner and Harer 2008) are filtered spaces, i.e. pairs (X, f) where X is a topological space (e.g., the space of a simplicial complex) and $f:X \rightarrow \mathbb {R}$ is a map called the filtering function: sublevel sets $X_u= f^{-1}\big ((-\infty , u]\big )$ are compared through homology morphisms induced by inclusion, in particular through the so-called Persistent Betti Number functions. From such a function a persistence diagram (see Definition 1) can be built (Cohen-Steiner et al. (2007), Sect. 2). In turn, Persistent Betti Number functions can be recovered from the persistence diagram (Cohen-Steiner et al. 2007).

Persistence diagrams are the most widely used “fingerprints” of filtered spaces. The bottleneck distance between persistence diagrams yields an effective lower bound to distances between filtered spaces. This makes persistence diagrams a powerful tool in shape classification, analysis and retrieval. The strategic advantage of the generalisation started in Bergomi and Vertechi (2020), Bergomi et al. (2021) consists in the fact that also categorical persistence functions (Definition 4) can be represented by persistence diagrams: see Bergomi and Vertechi (2020, Sec. 3.9).

In $\mathbb {R}\times (\mathbb {R}\cup \{+\infty \})$ set $\varDelta =\{(u, v) \, | \, u=v\}$, $\varDelta ^+=\{(u,v) \, | \, u<v \}$ and $\bar{\varDelta }^+ = \varDelta \cup \varDelta ^+$. In a multiset, the multiplicity of an element will be the number of times that the element appears.

Definition 1

A persistence diagram D is a multiset of points of $\bar{\varDelta }^+$ where every point of the diagonal $\varDelta $ appears with infinite multiplicity.

The points of D belonging to $\varDelta ^+$ are called cornerpoints; they are said to be proper if both their coordinates are finite, cornerpoints at infinity otherwise. A persistence diagram is said to be finite if so is its set of cornerpoints. We shall only consider finite persistence diagrams.

Definition 2

Given persistence diagrams $D, D'$, let $\Gamma $ be the set of all bijections between D and $D'$. We define the bottleneck (formerly matching) distance as the real number

$$\begin{aligned} d(D, D') = \inf _{\gamma \in \Gamma } \sup _{p\in D} \Vert p-\gamma (p)\Vert _\infty \end{aligned}$$

First, this distance function checks the maximum displacement between corresponding points for a given matching either between cornerpoints of the two diagrams or cornerpoints and their projections on the diagonal $\varDelta $. Then, the minimum among these maxima is computed. Minima and maxima are actually attained because of the requested finiteness.
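To make Definition 2 concrete, the following Python sketch computes the bottleneck distance by brute force for very small diagrams. It is an illustration only and not part of the authors' package: the function name `bottleneck_distance` is hypothetical, only proper cornerpoints are handled, and the infinitely many diagonal points are modelled by adding as many diagonal slots to each diagram as the other has cornerpoints.

```python
from itertools import permutations

def bottleneck_distance(D1, D2):
    """Brute-force bottleneck distance between two finite diagrams, given as
    lists of proper cornerpoints (u, v) with u < v and both coordinates finite."""
    DIAG = None  # stands for "some point of the diagonal"
    A = list(D1) + [DIAG] * len(D2)
    B = list(D2) + [DIAG] * len(D1)

    def cost(p, q):
        if p is DIAG and q is DIAG:
            return 0.0
        if p is DIAG:                      # q is sent to its projection on the diagonal
            return (q[1] - q[0]) / 2
        if q is DIAG:                      # p is sent to its projection on the diagonal
            return (p[1] - p[0]) / 2
        return max(abs(p[0] - q[0]), abs(p[1] - q[1]))  # sup-norm displacement

    best = float("inf")
    for perm in permutations(range(len(B))):            # every matching
        worst = max((cost(A[i], B[j]) for i, j in enumerate(perm)), default=0.0)
        best = min(best, worst)                          # min over matchings of the max
    return best

print(bottleneck_distance([(0, 4), (1, 2)], [(0, 5)]))
```

The enumeration is factorial in the number of cornerpoints, so this is only meant for checking small examples by hand.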

2 Graph-theoretical persistence

Let Graph be the category having finite simple undirected graphs as objects and injective simplicial applications as morphisms, seen as a subcategory of the category of finite simplicial complexes. In what follows, a graph will be considered as the pair of its vertex set and edge set, i.e. $G=(V, E)$, $G'=(V', E')$ and so on.

Definition 3

An $(\mathbb {R}, \le )$-indexed diagram is any functor from the category $(\mathbb {R}, \le )$ to an arbitrary category $\mathbf {C}$. $(\mathbb {R}, \le )$-indexed diagrams form a category, $\mathbf {C}^{(\mathbb {R}, \le )}$. The $(\mathbb {R}, \le )$-indexed diagram is said to be monic if all morphisms of its image are monomorphisms of $\mathbf {C}$.

We consider $(\mathbb {R}, \le )$-indexed diagrams in Graph that are constant on a finite set of left-closed, right-open intervals. Because of the choice of monomorphisms as the only acceptable morphisms, every such $(\mathbb {R}, \le )$-indexed diagram is monic, see Definition 3, and can be seen, up to natural isomorphisms, as a filtration of a graph G coming from a filtering function $f:V\cup E \rightarrow \mathbb {R}\cup \{+\infty \}$. Moreover, we shall limit our study to $(\mathbb {R}, \le )$-indexed diagrams whose associated filtration has no isolated vertices at any level. In other words, the filtering function f takes value $+\infty $ if a vertex is isolated, and the minimum of its values on the edges incident to the vertex, otherwise. Thus, f is determined by its restriction to E; therefore the weighted graphs considered here are pairs (G, f) with $f:E \rightarrow \mathbb {R}$. By construction, the subgraphs of the corresponding filtrations are induced by their edge sets.

Definition 4

Let ${\bar{\mathbf {C}}}$ be a category. A lower-bounded function $p:\text {Morph}({\bar{\mathbf {C}}}) \rightarrow \mathbb {Z}$ is a categorical persistence function if, for all $u_1 \rightarrow u_2 \rightarrow v_1 \rightarrow v_2$, the following inequalities hold:

1.
$p(u_1\rightarrow v_1) \le p(u_2\rightarrow v_1)$ and $p(u_2\rightarrow v_2) \le p(u_2\rightarrow v_1)$.
2.
$p(u_2\rightarrow v_1) - p(u_1\rightarrow v_1) \ge p(u_2\rightarrow v_2) - p(u_1\rightarrow v_2)$.
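For ${\bar{\mathbf {C}}}=(\mathbb {R}, \le )$, the two inequalities can be tested numerically on a finite set of levels. The sketch below is only a sanity check under that assumption; the names `is_persistence_function` and `bar_count` are hypothetical, and passing the check on a finite grid does not prove that a function satisfies Definition 4 everywhere.

```python
from itertools import combinations

def is_persistence_function(p, levels):
    """Check conditions 1 and 2 of Definition 4 for p(u, v) on every chain
    u1 < u2 < v1 < v2 drawn from a finite list of levels."""
    for u1, u2, v1, v2 in combinations(sorted(levels), 4):
        cond1 = p(u1, v1) <= p(u2, v1) and p(u2, v2) <= p(u2, v1)
        cond2 = p(u2, v1) - p(u1, v1) >= p(u2, v2) - p(u1, v2)
        if not (cond1 and cond2):
            return False
    return True

# Example: count the half-open bars [b, d) that contain the interval [u, v].
bars = [(0, 3), (1, 5)]

def bar_count(u, v):
    return sum(b <= u and v < d for b, d in bars)

levels = [0, 1, 2, 3, 4, 5, 6]
print(is_persistence_function(bar_count, levels))           # satisfies both conditions
print(is_persistence_function(lambda u, v: v - u, levels))  # violates condition 1
```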

Remark 1

Such a function is categorical in the sense that it yields the same result to morphisms obtained from each other by composition with a ${\bar{\mathbf {C}}}$-isomorphism. For instance, we can retrieve the framework of classical topological persistence by setting ${\bar{\mathbf {C}}}= \mathbf {Vect}$ and p as the rank operator, i.e. the dimension of the image.

In what follows we focus on ${\bar{\mathbf {C}}}=(\mathbb {R}, \le )$. In this case a morphism $u\rightarrow v$ is simply the relation $u\le v$, which is represented as the point (u, v) in the persistence diagrams.

Definition 5

Let p be a map assigning to each monic $(\mathbb {R}, \le )$-indexed diagram M in a category $\mathbf {C}$ a categorical persistence function $p_M$ on $(\mathbb {R}, \le )$, such that $p_{M} = p_{M'}$ whenever a natural isomorphism between M and $M'$ exists. All the resulting categorical persistence functions $p_M$ are called indexing-aware persistence functions in $\mathbf {C}$ (ip-functions for brevity). The map p itself is called an ip-function generator.

Remark 2

An ip-function generator is actually a categorical function (in the sense of Remark 1) on the functor category $\mathbf {C}^{(\mathbb {R}, \le )}$ .

An ip-function in Graph (Definition 5) $p_M$, where M is an $(\mathbb {R}, \le )$-indexed diagram, will be denoted $p_{(G, f)}$, where M corresponds to the filtration produced by the weighted graph (G, f). The associated persistence diagram will be denoted by D(f), for the sake of simplicity and if no confusion may occur.

We can now observe that ip-functions are a particular case of categorical persistence functions in the category Graph. We recall that categorical persistence functions generalise Persistent Betti Number (PBN) functions. The difference between any of the categorical persistence functions introduced in Bergomi et al. (2021) and an ip-function defined here is that the former comes from a functor defined on Graph, while the latter strictly depends on the filtration, so comes from a functor defined on $(\mathbb {R}, \le )$.

Remark 3

The graph depicted in Fig. 1 shall be our running toy example along the entire manuscript. In the figure, we report the PBN functions of degree 0 and 1 to allow the reader to compare those classical results with the ones we shall obtain through ip-functions.

In Sect. 4, we extend the notions introduced above to the category of directed graphs.

2.1 Balanced ip-functions

The categorical functions introduced in Bergomi et al. (2021) are stable, i.e. the bottleneck distance between their persistence diagrams is a lower bound for their interleaving distance. The same does not automatically hold for ip-functions. However, we shall state a condition (Definition 6) which implies stability (as proved in Theorem 1). This condition corresponds to d’Amico et al. (2010, Proposition 10): there, it is proved for 0-degree PBNs, and from it the stability theorem (d’Amico et al. (2010), Theorem 29) follows through a sequence of lemmas; here, it is postulated.

Definition 6

Let p be an ip-function generator on Graph. The map p itself and the resulting ip-functions are said to be balanced if the following condition is satisfied. Let (G, f) and $(G', f')$ be any two weighted graphs, and $p_{(G, f)}$, $p_{(G', f')}$ their associated ip-functions. If an isomorphism $\psi :G\rightarrow G'$ and a positive real number h exist, such that $\sup _{e\in E} |f(e)-f'\big (\psi (e)\big )|\le h$, then for all $(u, v)\in \varDelta ^+$ the inequality $p_{(G, f)}(u-h, v+h)\le p_{(G', f')}(u, v)$ holds.

Let (G, f), $(G', f')$ be as above. Let also $\mathcal {H}$ be the (possibly empty) set of graph isomorphisms between G and $G'$. We can now carry over to Graph some definitions given in Frosini and Mulazzani (1999), d’Amico et al. (2010), and Lesnick (2015).

Definition 7

The natural pseudodistance of (G, f) and $(G', f')$ is

$$\begin{aligned} \delta \big ((G, f), (G', f')\big ) = \left\{ \begin{array}{l l} +\infty & \text {if} \ \ \ \mathcal {H}=\emptyset \\ \inf _{\phi \in \mathcal {H}}\sup _{e\in E} |f(e) - f'\big (\phi (e)\big )| \ \ & \text {otherwise} \end{array} \right. \end{aligned}$$
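For very small graphs, the natural pseudodistance of Definition 7 can be evaluated by enumerating all vertex bijections and keeping those that are graph isomorphisms. The following Python sketch does exactly that; it is a hypothetical helper written for illustration (the name `natural_pseudodistance` and the edge encoding as `frozenset` keys are assumptions), and its factorial cost makes it unsuitable for real data.

```python
from itertools import permutations

def natural_pseudodistance(V1, f1, V2, f2):
    """V1, V2: vertex lists. f1, f2: dicts mapping each edge, encoded as
    frozenset({a, b}), to its weight. Returns +inf if no isomorphism exists."""
    if len(V1) != len(V2) or len(f1) != len(f2):
        return float("inf")
    best = float("inf")
    for image in permutations(V2):
        phi = dict(zip(V1, image))                       # candidate bijection V1 -> V2
        mapped = {frozenset(phi[x] for x in e): w for e, w in f1.items()}
        if mapped.keys() != f2.keys():
            continue                                     # phi does not preserve edges
        sup = max((abs(w - f2[e]) for e, w in mapped.items()), default=0.0)
        best = min(best, sup)                            # inf over isomorphisms
    return best

# Two weighted paths on three vertices.
f1 = {frozenset("ab"): 1.0, frozenset("bc"): 2.0}
f2 = {frozenset("xy"): 2.5, frozenset("yz"): 1.0}
print(natural_pseudodistance(["a", "b", "c"], f1, ["x", "y", "z"], f2))
```

In this example the isomorphism reversing the path (a to z, b to y, c to x) is the one that realises the infimum.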

Some simple adjustments of the proof of d’Amico et al. (2010, Theorem 29) and of its preceding lemmas yield the following theorem.

Theorem 1

(Stability) Let p be a balanced ip-function generator in Graph and $(G, f), (G', f')$ be two weighted graphs. Then we have

$$\begin{aligned} d\big (D(f), D(f')\big ) \le \delta \big ((G, f), (G', f')\big ), \end{aligned}$$

where D(f) and $D(f')$ are the persistence diagrams realized by the ip-functions $p_{(G, f)}$ and $p_{(G', f')}$ respectively. $\square $

Through Frosini et al. (2019, Theorem 5.8), this also implies stability with respect to the interleaving distance. Universality (Lesnick (2015), Sec. 5.2) is generally not granted for stable persistence functions: it needs ad hoc constructions.

When discussing stability above, we introduced two distinct graphs. However, the following proposition describes stability when considering a single graph and two filtering functions. This result will be useful in the remainder of the paper.

Proposition 1

The ip-function generator p is balanced if and only if the following condition is satisfied. Let $G=(V, E)$ be any graph, f and g be two filtering functions on G, and $p_{(G, f)}$ and $p_{(G, g)}$ their ip-functions. If a positive real number h exists, such that $\sup _{e\in E}|f(e)-g(e)|\le h$, then for all $(u, v)\in \varDelta ^+$ the inequality $p_{(G,f)}(u-h, v+h) \le p_{(G, g)}(u, v)$ holds.

Proof

One of the two implications is immediate. The other is proved by the fact that $p_{(G, g)} = p_{(G', f')}$ where $g = f'\circ \psi $, with the notation of Definition 6. $\square $

Remark 4

The condition is symmetric: if it holds as in the statement of Proposition 1, then also $p_{(G, g)}(u-h, v+h) \le p_{(G, f)}(u, v)$ holds for all $(u, v)\in \varDelta ^+$.

2.2 Steady and ranging sets

Definition 8

Given a graph $G = (V, E)$, any function $\mathcal {F}:2^{V\cup E} \rightarrow \{true, false\}$ is called a feature. We call any $X\subset V\cup E$ such that $\mathcal {F}(X)= true$ an $\mathcal {F}$-set. Given a weighted graph (G, f) and a real number u, we denote by $G_u$ the subgraph of G induced by the edge set $f^{-1}\big ((-\infty , u]\big )$. We shall say that $X\subset V\cup E$ is an $\mathcal {F}$-set at level $w\in \mathbb {R}$ if it is an $\mathcal {F}$-set of the subgraph $G_w$.

Definition 9

Let $\mathcal {F}$ be a feature of G. We define the maximal feature $m\mathcal {F}$ associated with $\mathcal {F}$ as follows: for any $X \subseteq (V \cup E)$, $m\mathcal {F}(X) = true$ if and only if $\mathcal {F}(X)=true$ and there is no $Y \subseteq (V \cup E)$ such that $X \subset Y$ and $\mathcal {F}(Y)=true$.

Definition 10

Let $\mathcal {F}$ be a feature. A set $X\subseteq V\cup E$ is a steady $ \mathcal {F}$-set (s$\mathcal {F}$-set for brevity) at $(u, v) \in \varDelta ^+$ if it is an $\mathcal {F}$-set at all levels w with $u\le w \le v$. We call X a ranging $\mathcal {F}$-set (r$\mathcal {F}$-set) at (u, v) if there exist levels $w\le u$ and $w'\ge v$ at which it is an $\mathcal {F}$-set.

Let $S^\mathcal {F}_{(G, f)}(u,v)$ be the set of s$\mathcal {F}$-sets at (u, v) and let $R^\mathcal {F}_{(G, f)}(u,v)$ be the set of r$\mathcal {F}$-sets at (u, v).

Remark 5

Intuitively, the adjective “steady” stresses that a steady set enjoys a given feature $\mathcal {F}$ throughout the entire interval [u, v]. “Ranging”, instead, refers to the fact that a ranging set spans, with feature $\mathcal {F}$, the range [u, v] although possibly with gaps. Of course, steady implies ranging. This implication is granted by the “$\le $” and “$\ge $” signs in the definitions. With strict inequalities the implication fails. There are features for which steady is equivalent to ranging, e.g., features for which a set can be an $\mathcal {F}$-set only in a (possibly unbounded) interval. A simple example is the feature $\mathcal {F}$ which assigns true only to singletons consisting of a vertex of a fixed degree.

Lemma 1

If $u\le u' < v' \le v$, then

1.
$S^\mathcal {F}_{(G, f)}(u,v) \subseteq S^\mathcal {F}_{(G, f)}(u',v')$
2.
$R^\mathcal {F}_{(G, f)}(u,v) \subseteq R^\mathcal {F}_{(G, f)}(u',v')$

where the equalities hold if $G_u = G_{u'}$ and $G_v = G_{v'}$. Moreover $S^\mathcal {F}_{(G, f)}(u,v) = \emptyset = R^\mathcal {F}_{(G, f)}(u, v)$ if $G_u =\emptyset $.

Proof

Directly from the definitions of steady and ranging $\mathcal {F}$-sets. $\square $

Definition 11

Let $\mathcal {F}$ be a feature. For any graph G, for any filtering function $f:E \rightarrow \mathbb {R}$, we define $\sigma ^\mathcal {F}_{(G, f)}: \varDelta ^+ \rightarrow \mathbb {Z}$ as the function which assigns to $(u, v) \in \varDelta ^+$ the number $|S^\mathcal {F}_{(G, f)}(u,v)|$ and $\varrho ^\mathcal {F}_{(G, f)}: \varDelta ^+ \rightarrow \mathbb {Z}$ as the function which assigns to $(u, v) \in \varDelta ^+$ the number $|R^\mathcal {F}_{(G, f)}(u,v)|$. We denote by $\sigma ^\mathcal {F}$ and $\varrho ^\mathcal {F}$ the maps assigning $\sigma ^\mathcal {F}_{(G, f)}$ and $\varrho ^\mathcal {F}_{(G, f)}$ respectively to the $(\mathbb {R}, \le )$-indexed diagram corresponding to (G, f).
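Definitions 8, 10 and 11 translate into a short computation once one observes that the filtration only changes at the values taken by f, so finitely many levels need to be inspected. The Python sketch below is a minimal illustration under assumptions of our own: the names are hypothetical, a feature is passed as a function of a set X and the edge set of $G_w$, candidate sets are supplied explicitly, and the empty graph is taken to have no $\mathcal {F}$-sets (in line with Lemma 1).

```python
def sublevel_edges(f, w):
    """Edge set of G_w: the edges e with f(e) <= w (f maps (a, b) tuples to weights)."""
    return {e for e, val in f.items() if val <= w}

def steady_and_ranging(f, feature, candidates, u, v):
    """Return (S, R): candidate sets that are steady, resp. ranging, at (u, v), u < v."""
    crit = sorted(set(f.values()))          # the only levels where G_w changes

    def holds(X, w):
        edges = sublevel_edges(f, w)
        return bool(edges) and feature(X, edges)

    inside = [u] + [c for c in crit if u < c <= v]   # represents every w in [u, v]
    below = [c for c in crit if c <= u]              # represents every w <= u
    above = [v] + [c for c in crit if c > v]         # represents every w >= v
    S = [X for X in candidates if all(holds(X, w) for w in inside)]
    R = [X for X in candidates
         if any(holds(X, w) for w in below) and any(holds(X, w) for w in above)]
    return S, R

def has_degree_one(X, edges):
    """Feature of Remark 5 with fixed degree 1: X is a singleton {x}, deg(x) = 1."""
    if len(X) != 1:
        return False
    x = next(iter(X))
    return sum(x in e for e in edges) == 1

# Path a-b-c-d with increasing weights.
f = {("a", "b"): 1, ("b", "c"): 2, ("c", "d"): 3}
candidates = [frozenset({x}) for x in "abcd"]
S, R = steady_and_ranging(f, has_degree_one, candidates, 1, 2.5)
print(len(S), len(R))   # the values sigma(1, 2.5) and rho(1, 2.5) for this feature
```

For this feature the steady and ranging sets coincide, as noted in Remark 5; a feature that can switch off and on again along the filtration is needed to tell them apart.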

Proposition 2

The maps $\sigma ^\mathcal {F}$ and $\varrho ^\mathcal {F}$ are ip-function generators.

Proof

We prove conditions 1 and 2 of Definition 4, recalling that the source category is $(\mathbb {R}, \le )$, so the existence of a morphism $u \rightarrow v$ (with $u\ne v$) simply means that $u<v$. Assume $u_1<u_2<v_1<v_2$. Let (G, f) be any weighted graph.

(Condition 1 for $\sigma ^\mathcal {F}$) By Lemma 1, $S^\mathcal {F}_{(G, f)}(u_1, v_1) \subseteq S^\mathcal {F}_{(G, f)}(u_2, v_1)$, so $|S^\mathcal {F}_{(G, f)}(u_1, v_1)| \le |S^\mathcal {F}_{(G, f)}(u_2, v_1)|$. Also $S^\mathcal {F}_{(G, f)}(u_2, v_2) \subseteq S^\mathcal {F}_{(G, f)}(u_2, v_1)$ and $|S^\mathcal {F}_{(G, f)}(u_2, v_2)| \le |S^\mathcal {F}_{(G, f)}(u_2, v_1)|$.
(Condition 2 for $\sigma ^\mathcal {F}$) By Lemma 1, $S^\mathcal {F}_{(G, f)}(u_1, v_1) \subseteq S^\mathcal {F}_{(G, f)}(u_2, v_1)$, so $|S^\mathcal {F}_{(G, f)}(u_2, v_1)| - |S^\mathcal {F}_{(G, f)}(u_1, v_1)|$ is the number of s$\mathcal {F}$-sets at $(u_2, v_1)$ which fail to be $\mathcal {F}$-sets at some w with $u_1\le w \le u_2$. Analogously for $|S^\mathcal {F}_{(G, f)}(u_2, v_2)| - |S^\mathcal {F}_{(G, f)}(u_1, v_2)|$. Now, every s$\mathcal {F}$-set at $(u_2, v_2)$ which fails to be an $\mathcal {F}$-set at w with $u_1\le w \le u_2$ is also an s$\mathcal {F}$-set at $(u_2, v_1)$ failing at the same w. So $S^\mathcal {F}_{(G, f)}(u_2, v_1) - S^\mathcal {F}_{(G, f)}(u_1, v_1) \supseteq S^\mathcal {F}_{(G, f)}(u_2, v_2) - S^\mathcal {F}_{(G, f)}(u_1, v_2)$ and $|S^\mathcal {F}_{(G, f)}(u_2, v_1)| - |S^\mathcal {F}_{(G, f)}(u_1, v_1)| \ge |S^\mathcal {F}_{(G, f)}(u_2, v_2)| - |S^\mathcal {F}_{(G, f)}(u_1, v_2)|$.
(Condition 1 for $\varrho ^\mathcal {F}$) The argument is the same as for $\sigma ^\mathcal {F}$.
(Condition 2 for $\varrho ^\mathcal {F}$) By Lemma 1, $R^\mathcal {F}_{(G, f)}(u_1, v_1) \subseteq R^\mathcal {F}_{(G, f)}(u_2, v_1)$, so $|R^\mathcal {F}_{(G, f)}(u_2, v_1)| - |R^\mathcal {F}_{(G, f)}(u_1, v_1)|$ is the number of r$\mathcal {F}$-sets at $(u_2, v_1)$ which fail to be $\mathcal {F}$-sets at all levels w with $w \le u_1$. Analogously for $|R^\mathcal {F}_{(G, f)}(u_2, v_2)| - |R^\mathcal {F}_{(G, f)}(u_1, v_2)|$. Now, every r$\mathcal {F}$-set at $(u_2, v_2)$ which fails to be an $\mathcal {F}$-set at all levels w with $w \le u_1$ is also an r$\mathcal {F}$-set at $(u_2, v_1)$ failing at the same levels w. So $R^\mathcal {F}_{(G, f)}(u_2, v_1) - R^\mathcal {F}_{(G, f)}(u_1, v_1) \supseteq R^\mathcal {F}_{(G, f)}(u_2, v_2) - R^\mathcal {F}_{(G, f)}(u_1, v_2)$ and $|R^\mathcal {F}_{(G, f)}(u_2, v_1)| - |R^\mathcal {F}_{(G, f)}(u_1, v_1)| \ge |R^\mathcal {F}_{(G, f)}(u_2, v_2)| - |R^\mathcal {F}_{(G, f)}(u_1, v_2)|$.

$\square $

The value of both functions $\sigma ^\mathcal {F}_{(G, f)}$ and $\varrho ^\mathcal {F}_{(G, f)}$ at a point P on a vertical (resp. horizontal) discontinuity line is the same as the value at the points in a right (resp. upper) neighbourhood of P.

Of course, there are many features which give valid ip-functions: e.g. the features $\mathcal {F}$ such that, if X is an $\mathcal {F}$-set at level u, then it is an $\mathcal {F}$-set also at level v for all $v>u$.

We still don’t know which general hypothesis on $\mathcal {F}$ would imply that $\sigma ^\mathcal {F}$ or $\varrho ^\mathcal {F}$ are balanced ip-function generators (Definition 6). Such features exist: Sect. 2.4 presents a whole class of features giving rise to balanced ip-functions.

2.3 Steady and ranging persistence on Eulerian sets

We now give an example of the framework exposed in Sect. 2.2. Given any graph G, we define $\mathcal {EU}: 2^{V\cup E} \rightarrow \{true, false\}$ to yield true on a set A if and only if A is a set of vertices whose induced subgraph of G is nonempty, Eulerian and maximal with respect to these properties; in that case A is said to be a $\mathcal {EU}$-set of G. $\mathcal {EU}$ is then the maximal version of a feature we are not going to deal with. Let now (G, f) be a weighted graph. We apply Definition 10 to feature $\mathcal {EU}$.
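The feature $\mathcal {EU}$ can be tested by brute force on small graphs. The Python sketch below assumes that "Eulerian" means connected with every vertex of even, positive degree (the text does not spell out the convention), and checks maximality by trying every proper superset of A. The function names are hypothetical and the enumeration is exponential in the number of vertices.

```python
from itertools import combinations

def induces_eulerian(A, edges):
    """True if the subgraph induced by vertex set A is nonempty, connected,
    and all its vertices have even, positive degree (assumed convention)."""
    A = set(A)
    sub = [(a, b) for a, b in edges if a in A and b in A]
    if not sub:
        return False
    nbrs = {x: set() for x in A}
    for a, b in sub:
        nbrs[a].add(b)
        nbrs[b].add(a)
    if any(len(n) == 0 or len(n) % 2 for n in nbrs.values()):
        return False
    start = next(iter(A))                 # depth-first search for connectivity
    seen, stack = {start}, [start]
    while stack:
        for y in nbrs[stack.pop()] - seen:
            seen.add(y)
            stack.append(y)
    return seen == A

def is_eu_set(A, vertices, edges):
    """True if A induces an Eulerian subgraph and no proper superset does."""
    A = set(A)
    if not induces_eulerian(A, edges):
        return False
    rest = [x for x in vertices if x not in A]
    for k in range(1, len(rest) + 1):
        for extra in combinations(rest, k):
            if induces_eulerian(A | set(extra), edges):
                return False
    return True

# Two triangles sharing vertex 3: only the whole vertex set is maximal.
bowtie = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (3, 5)]
print(is_eu_set({1, 2, 3}, range(1, 6), bowtie))
print(is_eu_set({1, 2, 3, 4, 5}, range(1, 6), bowtie))
```

To obtain $\mathcal {EU}$-sets at level w, the same test would be applied to the edge list of $G_w$ instead of the full graph.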

Definition 12

For any real number w, the subset $A\subseteq V$ is a $\mathcal {EU}$-set at level w if it is a $\mathcal {EU}$-set of the subgraph $G_w$. It is a steady $\mathcal {EU}$-set (an s$\mathcal {EU}$-set) at $(u, v) \in \varDelta ^+$ if it is a $\mathcal {EU}$-set at all levels w with $u\le w \le v$. It is a ranging $\mathcal {EU}$-set (an r$\mathcal {EU}$-set) at (u, v) if there exist levels $w\le u$ and $w'\ge v$ at which it is a $\mathcal {EU}$-set.

$S^{\mathcal {EU}}_{(G, f)}(u, v)$ and $R^{\mathcal {EU}}_{(G, f)}(u, v)$ are respectively the sets of s$\mathcal {EU}$-sets and of r$\mathcal {EU}$-sets at (u, v). We define $\sigma ^{\mathcal {EU}}_{(G, f)}: \varDelta ^+ \rightarrow \mathbb {R}$ as the function which assigns to $(u, v) \in \varDelta ^+$ the number $|S^{\mathcal {EU}}_{(G, f)}(u,v)|$ and $\varrho ^{\mathcal {EU}}_{(G, f)}: \varDelta ^+ \rightarrow \mathbb {R}$ as the function which assigns to $(u, v) \in \varDelta ^+$ the number $|R^{\mathcal {EU}}_{(G, f)}(u,v)|$.

We denote by $\sigma ^{\mathcal {EU}}$ and $\varrho ^{\mathcal {EU}}$ the maps assigning $\sigma ^{\mathcal {EU}}_{(G, f)}$ and $\varrho ^{\mathcal {EU}}_{(G, f)}$ respectively to the $(\mathbb {R}, \le )$-indexed diagram corresponding to (G, f). By Proposition 2, $\sigma ^{\mathcal {EU}}$ and $\varrho ^{\mathcal {EU}}$ are ip-function generators.

Consider the example displayed in Fig. 1. In that particular example, the functions $\sigma ^{\mathcal {EU}}_{(G, f)}$ and $\varrho ^{\mathcal {EU}}_{(G, f)}$ are the same. Furthermore, they also coincide with the PBN function in degree 1 shown in the same figure. We show that this is not always the case in Fig. 2.

Neither $\sigma ^{\mathcal {EU}}$ nor $\varrho ^{\mathcal {EU}}$ is balanced (see the Appendix).

2.4 Monotone features

For a given graph $G = (V, E)$, we shall consider as subgraphs only the ones induced by sets of edges. The next definition is a variation on the notion of monotone (sometimes dubbed hereditary) property defined in Alon and Shapira (2008).

Definition 13

We say that a feature $\mathcal {F}$ is monotone if

1.
For any graphs $G' =(V', E')\subset G''=(V'', E'')$, and any $X \subseteq (V' \cup E')$, $\mathcal {F}(X) = true$ in $G''$ implies $\mathcal {F}(X)= true$ in $G'$.
2.
In any graph $\overline{G}=(\overline{V}, \overline{E})$, for any $Y \subset X \subseteq \overline{V} \cup \overline{E}$, $\mathcal {F}(X) = true$ implies $\mathcal {F}(Y)= true$.

A paradigmatic monotone feature is independence: independent (or stable) sets and matchings are examples of sets of vertices, respectively of edges, with monotone features.

For the remainder of this section, let (G, f) be a weighted graph, $G=(V, E)$, and $\mathcal {F}$ a monotone feature in G. By Proposition 2, $\sigma ^\mathcal {F}$ and $\varrho ^\mathcal {F}$ are ip-function generators.

Lemma 2

Let $X \subseteq (V \cup E)$. Then, either there is no value u for which $\mathcal {F}(X)=true$ in $G_u$, or $\mathcal {F}(X)=true$ in $G_u$ for all $u \in [u_1, v_1)$, where $u_1$ is the lowest value u such that in the subgraph $G_u=(V_u, E_u)$ one has $X \subseteq (V_u \cup E_u)$, and $v_1$ is either the lowest value v for which $\mathcal {F}(X)=false$ in $G_v$ or $+\infty $.

Proof

Assume that $\mathcal {F}(X)=true$ in $G_u$ for at least one value u. If $\mathcal {F}(X)=true$ in $G_u$, then $\mathcal {F}(X)=true$ in $G_{u'}=(V_{u'}, E_{u'})$ for all $u'<u$ such that $X\subseteq (V_{u'} \cup E_{u'})$ by Definition 13. $\square $

The interval $[u_1, v_1)$ of Lemma 2, i.e. the widest interval for which $\mathcal {F}(X)=true$ in (G, f), is called the $\mathcal {F}$-interval of X in (G, f).
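For the independence feature on vertex sets, the $\mathcal {F}$-interval of Lemma 2 has a closed form: $u_1$ is the first level at which all vertices of X have entered the filtration, and $v_1$ is the first level at which two vertices of X become adjacent. The Python sketch below computes it; the function name `independence_interval` and the edge encoding are assumptions made for illustration.

```python
import math

def independence_interval(X, f):
    """F-interval [u1, v1) of a nonempty vertex set X for the independence
    feature, or None if X is never an independent set of some G_u.
    f maps edges (a, b) to weights."""
    X = set(X)
    if not X:
        return None
    entry = []
    for x in X:
        incident = [w for e, w in f.items() if x in e]
        if not incident:
            return None            # x is isolated, so it never enters the filtration
        entry.append(min(incident))
    u1 = max(entry)                # first level at which X is contained in G_u
    inner = [w for (a, b), w in f.items() if a in X and b in X]
    v1 = min(inner) if inner else math.inf   # first level at which X stops being independent
    return (u1, v1) if u1 < v1 else None

# 4-cycle a-b-c-d-a.
f = {("a", "b"): 1, ("b", "c"): 2, ("c", "d"): 3, ("d", "a"): 4}
print(independence_interval({"a", "c"}, f))
print(independence_interval({"a", "d"}, f))
print(independence_interval({"a", "b"}, f))
```

With this interval in hand, X is a steady (equivalently, ranging) independent set at (u, v) exactly when $u_1 \le u$ and $v < v_1$.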

Proposition 3

$\sigma ^\mathcal {F} =\varrho ^\mathcal {F}$

Proof

By Lemma 2. $\square $

Let now g be another filtering function on G; in order to avoid confusion, for each real number u, we denote by $G_{f, u}$ (resp. $G_{g, u}$) the subgraph of G induced by the edge set $f^{-1}\big ((-\infty , u]\big )$ (resp. $g^{-1}\big ((-\infty , u]\big )$).

Lemma 3

Assume that there exists a positive real h such that $\sup _{e\in E}|f(e)-g(e)|\le h$. Assume also that $X \subseteq (V \cup E)$ exists, such that $[u_1, v_1)$ is its $\mathcal {F}$-interval in (G, f), with $u_1+2h<v_1<+\infty $. Then there is a non-empty $\mathcal {F}$-interval $[u_2, v_2)$ of X in (G, g), and $|u_1-u_2|\le h, |v_1-v_2|\le h$.

Proof

Assume that, for $e\in E$, $f(e)=u$; then $g(e)\le u+h$. This proves that, for each u, $G_{f, u}$ is a subgraph of $G_{g, u+h}$. Swapping the roles, also $G_{g, u}$ is a subgraph of $G_{f, u+h}$.

Therefore, if X exists in $G_{f, u}$ it also exists in $G_{g, u+h}$. Symmetrically, if X exists in $G_{g, u}$ it also exists in $G_{f, u+h}$. Recalling, by Lemma 2, the meaning of $u_1$ and, correspondingly, $u_2$, we obtain that $|u_1-u_2|\le h$.

If $\mathcal {F}(X)=true$ in $G_{f, u+h}$, then $\mathcal {F}(X)=true$ also in the subgraph $G_{g, u}$ because $\mathcal {F}$ is monotone. Analogously, $\mathcal {F}(X)=true$ in $G_{g, u+h}$ implies $\mathcal {F}(X)=true$ in $G_{f, u}$. Recalling, by Lemma 2, the meaning of $v_1$ and, correspondingly, of $v_2$, we obtain that $|v_1-v_2|\le h$. $\square $

Proposition 4

The ip-function generators $\sigma ^\mathcal {F} =\varrho ^\mathcal {F}$ are balanced.

Proof

We shall prove for $\sigma ^\mathcal {F}$ (and consequently for $\varrho ^\mathcal {F}$, by Proposition 3) the property stated in Proposition 1. With the notation and the assumptions of Lemma 3, assume that for $u<v$ we have $\sigma ^\mathcal {F}_{(G, f)}(u-h, v+h)>0$ (if it vanishes the claim is trivially true). We want to show that $\sigma _{(G, f)}^\mathcal {F}(u-h, v+h)\le \sigma ^\mathcal {F}_{(G, g)}(u, v)$. Let $X \subseteq (V \cup E)$ be such that $\mathcal {F}(X)= true$ in $G_{f, w}$ for all $w \in [u-h, v+h]$. Then, for the $\mathcal {F}$-interval $[u_1, v_1)$ of X in (G, f) we have $u_1\le u-h$, $v+h<v_1$. The $\mathcal {F}$-interval of the same X in (G, g) is $[u_2, v_2)$, with $|u_1-u_2|\le h$, $|v_1-v_2|\le h$ by Lemma 3. So, $u_2\le u_1+h \le u-h+h = u$ and $v= v+h-h < v_1-h \le v_2$, i.e. [u, v] is contained in the $\mathcal {F}$-interval of X in (G, g) and $\mathcal {F}(X)= true$ in $G_{g, w}$ for all $w\in [u,v]$. Therefore, an injective map exists from $S^\mathcal {F}_{(G,f)}(u-h, v+h)$ to $S^\mathcal {F}_{(G, g)}(u, v)$, proving that $\sigma ^\mathcal {F}_{(G, f)}(u-h, v+h) \le \sigma ^\mathcal {F}_{(G, g)}(u, v)$. $\square $
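The inequality of Proposition 4 can be checked numerically on a toy example. The Python sketch below does so for independent vertex sets, counting steady sets through their $\mathcal {F}$-intervals; all names are hypothetical, the two filtering functions are made up for illustration, and a successful check on a grid is a sanity test rather than a proof.

```python
import math
from itertools import combinations

def independence_interval(X, f):
    """[u1, v1) for a nonempty vertex set X under the independence feature, or None."""
    X = set(X)
    entry = []
    for x in X:
        incident = [w for e, w in f.items() if x in e]
        if not incident:
            return None
        entry.append(min(incident))
    u1 = max(entry)
    inner = [w for (a, b), w in f.items() if a in X and b in X]
    v1 = min(inner) if inner else math.inf
    return (u1, v1) if u1 < v1 else None

def sigma(f, vertices, u, v):
    """Number of nonempty vertex sets that are independent at all levels in [u, v]."""
    count = 0
    for k in range(1, len(vertices) + 1):
        for X in combinations(vertices, k):
            interval = independence_interval(X, f)
            if interval and interval[0] <= u and v < interval[1]:
                count += 1
    return count

def respects_balance(f, g, vertices, h, levels):
    """Check sigma_f(u - h, v + h) <= sigma_g(u, v) for all u < v in levels."""
    return all(sigma(f, vertices, u - h, v + h) <= sigma(g, vertices, u, v)
               for u, v in combinations(sorted(levels), 2))

# 4-cycle a-b-c-d-a with two filtering functions at sup-distance 0.5.
f = {("a", "b"): 1, ("b", "c"): 2, ("c", "d"): 3, ("d", "a"): 4}
g = {("a", "b"): 1.5, ("b", "c"): 1.5, ("c", "d"): 3.5, ("d", "a"): 3.5}
levels = [0.5 * k for k in range(11)]
print(sigma(f, "abcd", 2, 2.5))
print(respects_balance(f, g, "abcd", 0.5, levels))
```

By Remark 4 the check should also succeed with the roles of f and g exchanged.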

Monotone features—although balanced—often give rise to extremely rich persistence diagrams. For this reason, it is possible to consider instead the maximal version (that could be non-balanced) of those features. In Fig. 3, we show how maximal independent sets give rise to complex persistence diagrams, even considering as graph our running toy example (the one shown originally in Fig. 1). For the monotone feature $\mathcal {I}$ which identifies independent sets of vertices, $m\mathcal {I}$ is not balanced (see the Appendix).

However, the maximal version of the feature $\mathcal {M}$, which identifies matchings, produces balanced ip-function generators (Proposition 5). See Fig. 4 for the functions $\sigma ^{m\mathcal {M}}$ and $\varrho ^{m\mathcal {M}}$ of the usual example of Fig. 1.

Proposition 5

The ip-function generators $\sigma ^{m\mathcal {M}}$ and $\varrho ^{m\mathcal {M}}$ coincide and are balanced.

Proof

If the edge set X is a matching in a graph, it is a matching in all supergraphs. In a weighted graph (G, f), the set of levels w such that an edge set X is a maximal matching in $G_w = (V_w, E_w)$ is either empty or the interval $[u_2, v_2)$ where $u_2$ is the left end-point of the $\mathcal {M}$-interval of X and $v_2$ is either $+\infty $ or the left end-point of the $\mathcal {M}$-interval of a matching Y containing X. This proves that $\sigma ^{m\mathcal {M}}_{(G, f)} = \varrho ^{m\mathcal {M}}_{(G, f)}$.

Let now g be another filtering function on G, such that $\sup _{e\in E}|f(e)-g(e)|\le h$, with $h>0$. Assume that the interval $[u_2, v_2)$ on which X is a maximal matching is such that $u_2+2h< v_2 < +\infty $. Then, by Lemma 3, for the left end-point $u_3$ of the $\mathcal {M}$-interval of X in (G, g) and the left end-point $v_3$ of the $\mathcal {M}$-interval of Y in (G, g) one has $|u_2-u_3|\le h, |v_2-v_3|\le h$. So, if X belongs to $S^{m\mathcal {M}}_{(G,f)}(u-h, v+h)$, it also belongs to $S^{m\mathcal {M}}_{(G, g)}(u, v)$, proving that $\sigma ^{m\mathcal {M}}_{(G, f)}(u-h, v+h) \le \sigma ^{m\mathcal {M}}_{(G, g)}(u, v)$. $\square $

2.5 Hubs

Although the informal concept of hub is intuitively clear, it is not as easy to formalize in graph-theoretical terms. The simple idea of a vertex with (locally) maximum degree is not entirely satisfactory: in a social network it is common to find users with a lot of contacts, with whom, however, they interact poorly. Even a high sum of traffic intensities (e.g. the number of messages exchanged between a user and their connections) is not enough to bestow a vertex the central role implied by the word hub.

There is an important line of research on a probabilistic concept of “persistent hubs” based on degree maximality (Dereich and Mörters 2009; Galashin 2016; Banerjee and Bhamidi 2021) with some intersection with what we are proposing.

We shall use local degree prevalence as feature for building two ip-function generators: for any graph G we define $\mathcal {H}:2^{V\cup E} \rightarrow \{true, false\}$ to yield true only on singletons containing a vertex whose degree is greater than the ones of its neighbours. Such a vertex is called an $\mathcal {H}$-vertex or simply a hub. This feature, combined with the indexing-aware persistence framework and the notion of ranging and steady feature, allows for the identification of those vertices whose role is indeed central throughout the filtration of a given weighted graph (G, f).

Importantly, we preserve the flexibility granted in the realm of classical persistence: as one of the many possible variations, we could consider a vertex to be a hub if the sum of the values of f on the edges incident to it (instead of the degree) is greater than the corresponding sum at each of its neighbours.
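This variation can be sketched in a few lines. The function name `strength_hubs_at` and the toy weights are hypothetical, and the sketch assumes that $G_w$ consists of the edges with weight at most w.

```python
def strength_hubs_at(weights, w):
    """Vertices of G_w whose incident-weight sum exceeds every neighbour's."""
    strength, nbrs = {}, {}
    for (a, b), wt in weights.items():
        if wt <= w:
            for x, y in ((a, b), (b, a)):
                strength[x] = strength.get(x, 0) + wt
                nbrs.setdefault(x, set()).add(y)
    return {x for x in strength
            if all(strength[x] > strength[y] for y in nbrs[x])}


# Illustrative weights on the path b-a-c-d.
weights = {("a", "b"): 1, ("a", "c"): 2, ("c", "d"): 4}
print(strength_hubs_at(weights, 2))  # only a qualifies
print(strength_hubs_at(weights, 4))  # only c qualifies
```

At level 4 the vertices a and c both have degree 2, so neither has strictly prevailing degree, while the weight sums (3 for a, 6 for c) single out c.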

Our proposal is to build persistence diagrams in our generalized framework, and thereafter use the selection procedure presented in Kurlin (2016) (see Sect. 3.1) to identify relevant cornerpoints, thus identifying the “persistent” hubs of a given weighted graph (with a different meaning of the adjective than in Dereich and Mörters (2009), Galashin (2016), Banerjee and Bhamidi (2021)).

Definition 14

For any real number w, a vertex is a hub (or $\mathcal {H}$-vertex) at level w if it is an $\mathcal {H}$-vertex of the subgraph $G_w$. It is a steady hub (or s$\mathcal {H}$-vertex) at $(u, v)\in \varDelta ^+$ if it is an $\mathcal {H}$-vertex at all levels w with $u\le w\le v$. It is a ranging hub (or r$\mathcal {H}$-vertex) at $(u, v)\in \varDelta ^+$ if there exist levels $w \le u$ and $w'\ge v$ at which it is an $\mathcal {H}$-vertex.

$S^\mathcal {H}_{(G, f)}(u, v)$ and $R^\mathcal {H}_{(G, f)}(u, v)$ are respectively the sets of s$\mathcal {H}$-vertices and of r$\mathcal {H}$-vertices at (u, v). We define $\sigma ^\mathcal {H}_{(G, f)}: \varDelta ^+ \rightarrow \mathbb {R}$ as the function which assigns to $(u, v) \in \varDelta ^+$ the number $|S^\mathcal {H}_{(G, f)}(u,v)|$ and $\varrho ^\mathcal {H}_{(G, f)}: \varDelta ^+ \rightarrow \mathbb {R}$ as the function which assigns to $(u, v) \in \varDelta ^+$ the number $|R^\mathcal {H}_{(G, f)}(u,v)|$.
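Definition 14 translates almost literally into code. The following is a minimal sketch for simple graphs, assuming that $G_w$ is spanned by the edges with weight at most w; the function names and the toy weights are hypothetical.

```python
def hubs_at(weights, w):
    """Hubs of G_w: vertices whose degree strictly exceeds every neighbour's."""
    nbrs = {}
    for (a, b), wt in weights.items():
        if wt <= w:
            nbrs.setdefault(a, set()).add(b)
            nbrs.setdefault(b, set()).add(a)
    return {x for x, ns in nbrs.items()
            if all(len(ns) > len(nbrs[y]) for y in ns)}


def steady_hubs(weights, u, v):
    """Vertices that are hubs at every level w with u <= w <= v."""
    # G_w only changes at edge weights, so u and the weights in (u, v] suffice.
    levels = [u] + sorted({wt for wt in weights.values() if u < wt <= v})
    return set.intersection(*(hubs_at(weights, w) for w in levels))


def ranging_hubs(weights, u, v):
    """Vertices that are hubs at some level <= u and at some level >= v."""
    critical = sorted(set(weights.values()))
    below = set().union(*(hubs_at(weights, w) for w in critical if w <= u))
    above = set().union(*(hubs_at(weights, w)
                          for w in [v] + [c for c in critical if c > v]))
    return below & above


# Illustrative weights: a is a hub on [2, 3), loses prevalence at 3, regains it at 4.
weights = {("a", "b"): 1, ("a", "c"): 2, ("b", "c"): 3, ("a", "d"): 4}
sigma = len(steady_hubs(weights, 2, 4))   # value of the steady function at (2, 4)
rho = len(ranging_hubs(weights, 2, 4))    # value of the ranging function at (2, 4)
print(sigma, rho)
```

In this toy graph the vertex a is a ranging hub but not a steady hub at (2, 4), which is the kind of difference between the two functions that the definition is meant to capture.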

We denote by $\sigma ^\mathcal {H}$ and $\varrho ^\mathcal {H}$ the maps assigning $\sigma ^\mathcal {H}_{(G, f)}$ and $\varrho ^\mathcal {H}_{(G, f)}$ respectively to the $(\mathbb {R}, \le )$-indexed diagram corresponding to (G, f).

Figure 5 shows the two ip-functions $\sigma ^\mathcal {H}$ and $\varrho ^\mathcal {H}$ for the usual example of Fig. 1. Also $\sigma ^\mathcal {H}$ and $\varrho ^\mathcal {H}$ are not balanced (see the Appendix).

3 Persistent hubs

In this Section we present a first approach to hub detection implementable on real-world graphs. We consider this work in progress a sort of exploration of the meaning of steady and ranging hubs in different contexts; however, we will not compare our results to a ground truth.

In the following examples, instead of the functions $\sigma ^\mathcal {H}_{(G, f)}$ and $\varrho ^\mathcal {H}_{(G, f)}$, we will only show the corresponding persistence diagrams, to make the selection procedure clearer.

3.1 A selection procedure

It is well known in persistence that noise is represented by cornerpoints close to the diagonal $\varDelta $. However, not all cornerpoints close to $\varDelta $ necessarily represent noise, so how wide should the strip along $\varDelta $ to be discarded be? A smart, simple answer is offered in Kurlin (2016), where a remarkable application to segmentation of very noisy data is given. We summarize it here for a given persistence diagram D.

Call diagonal gap a maximal region of the form $\{(u,v) \in \varDelta ^+ \, | \, a<v-u<b\}$ where no cornerpoints of D lie; $b-a$ is its width. We can then form a hierarchy of diagonal gaps by decreasing width; out of it we get a hierarchy of sets of cornerpoints: we can consider the cornerpoints lying above the first, widest gap as the most relevant. Empirically, we may decide that also the cornerpoints sitting above the second, or the third widest gap are relevant, and so on. Equivalently, we consider the cornerpoints below the chosen gap as a possible result of noise, to be ignored. In Fig. 6 it is possible to observe how the selection of cornerpoints above the widest diagonal gap makes it possible to trace back those maxima (or classes of maxima, depending on the multiplicity of the cornerpoints) that are more relevant with respect to the trend of the time series.
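The selection can be sketched as follows. This is not Kurlin's implementation: it assumes a diagram given as a list of finite cornerpoints (u, v), and it also counts the strip between the diagonal and the lowest cornerpoint as a gap, which is an assumption of this sketch. The function names and the sample diagram are hypothetical.

```python
def diagonal_gaps(diagram):
    """Diagonal gaps (a, b) of a diagram, sorted by decreasing width b - a."""
    persistences = sorted({v - u for u, v in diagram})
    bounds = [0.0] + persistences
    gaps = [(a, b) for a, b in zip(bounds, bounds[1:]) if b > a]
    return sorted(gaps, key=lambda gap: gap[1] - gap[0], reverse=True)


def select_above_gap(diagram, rank=1):
    """Cornerpoints lying above the rank-th widest diagonal gap."""
    _, upper = diagonal_gaps(diagram)[rank - 1]
    return [(u, v) for u, v in diagram if v - u >= upper]


# Illustrative diagram with persistences 1, 1.5, 6 and 7.
diagram = [(0, 1), (1, 2.5), (0, 6), (2, 9)]
print(diagonal_gaps(diagram)[0])      # widest gap
print(select_above_gap(diagram))      # cornerpoints kept as relevant
```

Raising `rank` keeps the cornerpoints above the second or third widest gap, mirroring the empirical choice described in the text.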

In the next Sections we apply this selection criterion to the persistence diagrams corresponding to the functions $\sigma ^\mathcal {H}_{(G, f)}$ and $\varrho ^\mathcal {H}_{(G, f)}$, computed for some networks and some filtering functions. The vertices identified by the so selected cornerpoints will be called persistent hubs, in particular persistent steady hubs or persistent ranging hubs.

3.2 Airports

A first search for relevant hubs was carried out on a set of 44 major North-American cities (41 in the US, three in Canada; the ones in capital letters in the Amtrak railway map; see Table 1). The edges connect cities between which there have been flights in a randomly chosen but fixed week (June 11–17, 2018). Flight data have been obtained from Google Flights by selecting direct flights with Business Class; distances have been found at Prokerala.com. A single vertex has been considered for each city with more than one airport.

Table 1 The towns considered as vertices and the respective degrees in the graph

As filtering functions we used:

Distance
Number of flights in the fixed week
Their product

and their opposites, each shifted by adding the maximum of the corresponding function. For each such choice we looked for steady and ranging hubs, for a total of twelve different persistence diagrams. Note that the same vertex can contribute to several cornerpoints of the persistence diagram of $\sigma ^\mathcal {H}_{(G, f)}$, whereas this cannot happen for $\varrho ^\mathcal {H}_{(G, f)}$.

Next, we report results whose interest resides in the identification of hubs which do not rank very high by their degree. In particular, we do not find it of particular interest that Atlanta, Dallas, Chicago and Houston often turn out to be persistent ranging or steady hubs, since they have the highest degrees in the graph (42, 41, 40 and 40 respectively).
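The family of filtering functions listed above can be assembled as follows. This is only a sketch: the vertex names and all numerical values are made-up placeholders, not the data used in the paper, and reading the opposite of a function f as max(f) - f is an assumption of this sketch.

```python
def shifted_opposite(f):
    """Return max(f) - f: the opposite of f, shifted to stay non-negative."""
    top = max(f.values())
    return {edge: top - value for edge, value in f.items()}


def product(f, g):
    """Edge-wise product of two filtering functions on the same edges."""
    return {edge: f[edge] * g[edge] for edge in f}


# Placeholder edge attributes on a three-vertex graph.
distance = {("A", "B"): 300, ("A", "C"): 1200, ("B", "C"): 900}
flights = {("A", "B"): 40, ("A", "C"): 10, ("B", "C"): 25}

filterings = {
    "distance": distance,
    "flights": flights,
    "product": product(distance, flights),
}
# Add the shifted opposite of each of the three functions.
filterings.update({"opp_" + name: shifted_opposite(f)
                   for name, f in list(filterings.items())})
print(sorted(filterings))
```

Six filtering functions, each analysed through both steady and ranging hubs, account for the twelve persistence diagrams.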

The first occurrence of a persistent hub which is rather far from having highest degrees is with the filtering function distance: Seattle is just twelfth in the degree rank, but appears above the widest diagonal gap as a steady hub (Fig. 7). Persistent steady hubs are: Atlanta (with two cornerpoints), Dallas, Seattle.

Surprisingly, if we use the opposite of distance (added to the maximum distance, for ease of representation), the cornerpoints corresponding to vertices with highest degrees are located under the widest diagonal gap (Fig. 8). Persistent steady hubs are: Los Angeles, San Francisco, Seattle.

New York City has the eighth highest degree (35, together with Detroit, Phoenix and San Francisco). Still, we would expect it to appear as a hub, in the common sense of the term. In fact, it occurs as one of the few ranging hubs when the filtering functions (max minus number of flights) and distance$\cdot $(max minus number of flights) are used.

Ranging hubs for (max minus number of flights): Atlanta, Chicago, Dallas, New York.

Ranging hubs for the product filtering function are Atlanta, Chicago, Dallas, New York, Vancouver.

3.3 Characters co-occurrence in a novel

A classical benchmark for the analysis of hubs in co-occurrence graphs is given by Les Misérables. The network representing the co-occurrence of its characters is freely available at Graphistry. The graph has 77 major characters as vertices; each of the 254 edges joins two characters which appear together in at least one scene; the weight on an edge is the number of common occurrences. We used the inverse of the weight as a filtering function. We compare our results with the ones of Rieck et al. (2018), where the notion of clique-community centrality was used to spot particularly important characters: Table 2.

Table 2 Hubs in Les Misérables characters co-occurrence

Our method spots Cosette as a hub, whereas clique-community centrality does not. On the contrary, our technique misses Gavroche and Fantine. Both methods miss Javert. We are particularly puzzled by the result of Kurlin’s selection method: above the second widest diagonal gap (the first obviously isolates Jean Valjean) we find only Enjolras.

3.4 Time-varying hubs

Weighted graphs can represent discrete dynamics in time-varying processes. It is possible to keep track of persistent hubs, obtaining a concise representation of the relative importance of each hub in time. We considered the characters co-occurrence in five subsequent books of the Game of Thrones saga, and applied the algorithm mentioned above for the analysis of character co-occurrence in Les Misérables. In this case, however, characters evolve throughout the books. A global analysis, i.e., computing hubs on the graphs obtained by considering summary statistics on the five books, hardly carries dynamical information. On the contrary, persistent hubs yield an easily visualizable summary of the characters’ roles in time. See Fig. 9.

3.5 Languages

The website TerraLing.com contains much information, consisting of 165 properties, about several languages. It was used in an interesting research (Port et al. 2018) on persistent cycles in language families. Unfortunately the amount of information varies quite a lot from language to language. We analysed the mutual relations of 19 languages (18 of the European Union plus Turkish: Table 3) for which at least 50% of the 165 properties are checked. The graph is the complete one with 19 vertices. The filtering function defined on each edge is the opposite of the normalised quantity of common properties of the two languages that it connects. Ranging and steady hubs coincide and are: Castilian, Catalan, Dutch, English, Portuguese, Swedish.

Table 3 The 19 considered languages

Apart from the presence of English, which might also be biased by the great quantity of information available, we have no key for interpreting these results. For this and for the previous applications, we would very much like to set up a collaboration with domain experts.

4 Digraph persistence

In this section, let (G, f), with $G=(V, A)$, be any weighted digraph. Given a feature $\mathcal {F}: 2^{V\cup A} \rightarrow \{true, false\}$, it is straightforward to extend the definitions of balanced ip-function (Definition 6), of natural pseudodistance (Definition 7), the stability theorem (Theorem 1) and the definitions of steady and ranging sets (Definition 10) and of the ip-function generators $\sigma ^\mathcal {F}$ and $\varrho ^\mathcal {F}$ (Definition 11, Proposition 2) to this setting.

We define $\mathcal{DH}: 2^{V\cup A} \rightarrow \{true, false\}$ to yield true only on singletons containing a vertex whose outdegree is greater than the ones of its neighbours. Also in this case, there are many possible variations of this feature: we recover the notions of hub, steady hub and ranging hub and ip-function generators $\sigma ^{\mathcal{DH}}$ and $\varrho ^{\mathcal{DH}}$ as in Sect. 2.5.
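The outdegree-based feature can be sketched as follows, assuming that $G_w$ consists of the arcs with weight at most w and that the neighbours of a vertex are taken in the underlying undirected graph (both are assumptions of this sketch). The function name and the arc weights are hypothetical.

```python
def dh_hubs_at(arcs, w):
    """Vertices of G_w whose outdegree exceeds that of every neighbour."""
    outdeg, nbrs = {}, {}
    for (tail, head), wt in arcs.items():
        if wt <= w:
            outdeg[tail] = outdeg.get(tail, 0) + 1
            outdeg.setdefault(head, 0)
            nbrs.setdefault(tail, set()).add(head)
            nbrs.setdefault(head, set()).add(tail)
    return {x for x in outdeg
            if all(outdeg[x] > outdeg[y] for y in nbrs[x])}


# Two weighted tournaments on three vertices, with injective weights in {1, 2, 3}.
transitive = {("a", "b"): 1, ("a", "c"): 2, ("b", "c"): 3}
cyclic = {("a", "b"): 1, ("b", "c"): 2, ("c", "a"): 3}
print([dh_hubs_at(transitive, w) for w in (1, 2, 3)])
print([dh_hubs_at(cyclic, w) for w in (1, 2, 3)])
```

In the transitive tournament the source a keeps its prevalence at every level, whereas in the cyclic one a is a hub only until the second arc enters the filtration.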

Figure 10a presents all tournaments on three vertices, with injective functions with values in the set $\{1, 2, 3\}$. Figure 10b shows the values of some ip-functions. The correspondence between weighted tournaments and functions is given in Table 4. On these digraphs, $\sigma ^{\mathcal{DH}}$ and $\varrho ^{\mathcal{DH}}$ yield coinciding functions. However, this is not always the case, as shown in Fig. 11.

Table 4 The correspondence between the weighted digraphs of Fig. 10a and the diagrams of Fig. 10b for feature $\mathcal{DH}$

There are two opposite definitions of a kernel of a digraph; we shall consider the one given in Morgenstern and Von Neumann (1953). However, alternative definitions (see, e.g., Galeana-Sánchez and Hernández-Cruz (2014)) also give rise to admissible features in our framework. We define the feature $\mathcal {K}: 2^{V \cup A} \rightarrow \{true, false\}$ to yield true only on kernels, i.e. independent sets X of vertices such that for every vertex $w \in V-X$ there exists at least one arc $a \in A$ with w as tail and head in X, where independence is defined with respect to the underlying undirected graph. Then $\sigma ^\mathcal {K}$ and $\varrho ^\mathcal {K}$ are ip-function generators. The correspondence between weighted tournaments and functions is given in Table 5.
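The kernel condition can be checked by brute force on small digraphs. The following is a minimal sketch for an unweighted digraph (at a level w one would pass only the arcs with weight at most w); the function names and the two example tournaments are hypothetical.

```python
from itertools import combinations


def is_kernel(X, vertices, arcs):
    """True if X is independent and every vertex outside X has an arc into X."""
    X = set(X)
    independent = all(not (tail in X and head in X) for tail, head in arcs)
    absorbing = all(any(tail == w and head in X for tail, head in arcs)
                    for w in vertices - X)
    return independent and absorbing


def kernels(vertices, arcs):
    """All non-empty kernels of the digraph (vertices, arcs), by enumeration."""
    ordered = sorted(vertices)
    return [set(c) for r in range(1, len(ordered) + 1)
            for c in combinations(ordered, r) if is_kernel(c, vertices, arcs)]


vertices = {"a", "b", "c"}
transitive = {("a", "b"), ("a", "c"), ("b", "c")}
cyclic = {("a", "b"), ("b", "c"), ("c", "a")}
print(kernels(vertices, transitive))  # the sink c is the only kernel
print(kernels(vertices, cyclic))      # the directed 3-cycle has none
```

The enumeration is exponential in the number of vertices, so it is only meant for toy cases such as the three-vertex tournaments of Fig. 10a.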

Table 5 The correspondence between the weighted digraphs of Fig. 10a and the diagrams of Fig. 10b for feature $\mathcal {K}$

None of the ip-function generators $\sigma ^{\mathcal{DH}}$, $\varrho ^{\mathcal{DH}}$, $\sigma ^\mathcal {K}$, $\varrho ^\mathcal {K}$ is balanced (see the Appendix).

5 Conclusions

We introduced ip-functions in a fairly general setting and studied their stability. We have then restricted our scope to the categories of graphs and digraphs, where we have defined steady and ranging sets according to features relative to the given (di)graphs.

We showed how graph-theoretical features can be used directly to obtain a concise representation of weighted undirected and directed graphs as persistence diagrams. In particular, we believe that the steady and ranging ip-function generators allow for a more streamlined analysis of graphs and networks bypassing the construction of auxiliary simplicial complexes. Although the steady and ranging sets yield equivalent results in some cases, persistence diagrams associated with ranging sets are generally simpler than the ones derived from steady sets, so the information is represented in a more condensed way. This is not the only reason for considering both representations. In our applications, we focused on the notion of hub. There, we showcased how the ranging representation of hubs is relevant for hub detection: a vertex might be relevant for the global dynamics of a network if it has local degree prevalence at far enough levels. For example, in a graph whose vertices represent users of a social network, edges represent “friendship”, and weights represent geographical distance, we conjecture that high-persistence ranging hubs might be crucial for the diffusion of “viral” documents. Analogously, we thought that an airport might have a key role if it has a sort of centrality both at a regional and international level, but not necessarily at all intermediate ones.

References

Alon, N., Shapira, A.: Every monotone graph property is testable. SIAM J. Comput. 38(2), 505–522 (2008)
Anand, D.V., Meng, Z., Xia, K., Mu, Y.: Weighted persistent homology for osmolyte molecular aggregation and hydrogen-bonding network analysis. Sci. Rep. 10(1), 1–17 (2020)
Banerjee, S., Bhamidi, S.: Persistence of hubs in growing random networks. Probab. Theory Relat. Fields 180, 891–953 (2021)
Bergomi, M.G., Ferri, M., Vertechi, P., Zuffi, L.: Beyond topological persistence: starting from networks. Mathematics (2021). https://doi.org/10.3390/math9233079
Bergomi, M.G., Ferri, M., Zuffi, L.: Topological graph persistence. Commun. Appl. Ind. Math. 11(1), 72–87 (2020). https://doi.org/10.2478/caim-2020-0005
Bergomi, M.G., Vertechi, P.: Rank-based persistence. Theory Appl. Categ. 35(9), 228–260 (2020)
Blevins, A.S., Bassett, D.S.: Reorderability of node-filtered order complexes. Phys. Rev. E 101(5), 052311 (2020)
Bubenik, P., Scott, J.A.: Categorification of persistent homology. Discret. Comput. Geom. 51(3), 600–627 (2014)
Chazal, F., Cohen-Steiner, D., Glisse, M., Guibas, L.J., Oudot, S.Y.: Proximity of persistence modules and their diagrams. In: SCG ’09: Proceedings of the 25th Annual Symposium on Computational Geometry, pp. 237–246. ACM, New York (2009). https://doi.org/10.1145/1542362.1542407
Chowdhury, S., Mémoli, F.: Persistent path homology of directed networks. In: Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1152–1169. SIAM (2018)
Cohen-Steiner, D., Edelsbrunner, H., Harer, J.: Stability of persistence diagrams. Discret. Comput. Geom. 37(1), 103–120 (2007). https://doi.org/10.1007/s00454-006-1276-5
d’Amico, M., Frosini, P., Landi, C.: Natural pseudo-distance and optimal matching between reduced size functions. Acta Appl. Math. 109(2), 527–554 (2010)
Dereich, S., Mörters, P.: Random networks with sublinear preferential attachment: degree evolutions. Electron. J. Probab. 14, 1222–1267 (2009)
Edelsbrunner, H., Harer, J.: Persistent homology–a survey. Contemp. Math. 453, 257–282 (2008)
Frosini, P., Landi, C., Mémoli, F.: The persistent homotopy type distance. Homol. Homot. Appl. 21(2), 231–259 (2019). https://doi.org/10.4310/HHA.2019.v21.n2.a13
Frosini, P., Mulazzani, M.: Size homotopy groups for computation of natural size distances. Bull. Belg. Math. Soc. 6(3), 455–464 (1999)
Galashin, P.: Existence of a persistent hub in the convex preferential attachment model. Probab. Math. Stat. 36(1), 59–74 (2016)
Galeana-Sánchez, H., Hernández-Cruz, C.: On the existence of (k, l)-kernels in infinite digraphs: a survey. Discuss. Math. Graph Theory 34(3), 431–466 (2014)
Govc, D., Levi, R., Smith, J.P.: Complexes of tournaments, directionality filtrations and persistent homology. J. Appl. Comput. Topol. 5(2), 313–337 (2021)
Kim, W., Mémoli, F.: Generalized persistence diagrams for persistence modules over posets. J. Appl. Comput. Topol. 5(4), 533–581 (2021)
Kurlin, V.: A fast persistence-based segmentation of noisy 2d clouds with provable guarantees. Pattern Recognit. Lett. 83, 3–12 (2016)
Lesnick, M.: The theory of the interleaving distance on multidimensional persistence modules. Found. Comput. Math. (2015). https://doi.org/10.1007/s10208-015-9255-y
Lord, L.D., Expert, P., Fernandes, H.M., Petri, G., Van Hartevelt, T.J., Vaccarino, F., Deco, G., Turkheimer, F., Kringelbach, M.L.: Insights into brain architectures from the homological scaffolds of functional connectivity networks. Front. Syst. Neurosci. 10, 85 (2016)
McCleary, A., Patel, A.: Bottleneck stability for generalized persistence diagrams. Proc. Am. Math. Soc. 148, 3149–3161 (2020). https://doi.org/10.1090/proc/14929
McCleary, A., Patel, A.: Edit distance and persistence diagrams over lattices. SIAM J. Appl. Algebra Geom. 6(2), 134–155 (2022). https://doi.org/10.1137/20M1373700
Morgenstern, O., Von Neumann, J.: Theory of Games and Economic Behavior. Princeton University Press, Princeton (1953)
Oudot, S.Y.: Persistence Theory: From Quiver Representations to Data Analysis, vol. 209. American Mathematical Society, Providence (2015)
Patel, A.: Generalized persistence diagrams. J. Appl. Comput. Topol. 1(3–4), 397–419 (2018)
Petri, G., Expert, P., Turkheimer, F., Carhart-Harris, R., Nutt, D., Hellyer, P.J., Vaccarino, F.: Homological scaffolds of brain functional networks. J. R. Soc. Interface 11(101), 20140873 (2014)
Port, A., Gheorghita, I., Guth, D., Clark, J.M., Liang, C., Dasu, S., Marcolli, M.: Persistent topology of syntax. Math. Comput. Sci. 12(1), 33–50 (2018)
Reimann, M.W., Nolte, M., Scolamiero, M., Turner, K., Perin, R., Chindemi, G., Dłotko, P., Levi, R., Hess, K., Markram, H.: Cliques of neurons bound into cavities provide a missing link between structure and function. Front. Comput. Neurosci. 11, 48 (2017)
Rieck, B., Fugacci, U., Lukasczyk, J., Leitte, H.: Clique community persistence: a topological visual analysis approach for complex networks. IEEE Trans. Vis. Comput. Gr. 24(1), 822–831 (2018)
de Silva, V., Munch, E., Stefanou, A.: Theory of interleavings on categories with a flow. Theory Appl. Categ. 33(1), 583–607 (2018)
Sizemore, A.E., Giusti, C., Kahn, A., Vettel, J.M., Betzel, R.F., Bassett, D.S.: Cliques and cavities in the human connectome. J. Comput. Neurosci. 44(1), 115–145 (2018). https://doi.org/10.1007/s10827-017-0672-6

Acknowledgements

We are indebted to Diego Alberici, Emanuele Mingione, Pierluigi Contucci, Patrizio Frosini, Lorenzo Zuffi and above all Pietro Vertechi for many fruitful discussions. Article written within the activity of INdAM-GNSAGA. We thank the reviewers for the very helpful comments and suggestions. On behalf of all authors, the corresponding author states that there is no conflict of interest.

Funding

Open access funding provided by Alma Mater Studiorum - Università di Bologna within the CRUI-CARE Agreement.

Author information

Authors and Affiliations

Milan, Italy
Mattia G. Bergomi
Department of Mathematics, ARCES, University of Bologna, Bologna, Italy
Massimo Ferri & Antonella Tavaglione

Corresponding author

Correspondence to Massimo Ferri.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: Unbalanced

In order to show that some of the proposed ip-functions are not balanced—so their persistence diagrams do not generally enjoy stability—we give examples which do not respect Definition 6.

The ip-function generator $\sigma ^{\mathcal{EU}}$ is not balanced, as the example of Fig. 12 shows: in fact, the maximum absolute value of the weight difference on the same edges is 1, and $\sigma ^{\mathcal{EU}}_{(G, f)}(4.5-1, 10+1) = 1 > 0 = \sigma ^{\mathcal{EU}}_{(G, g)}(4.5, 10)$.

Also the ip-function generator $\varrho ^{\mathcal{EU}}$ is not balanced, as the example of Fig. 13 shows: in fact, the maximum absolute value of the weight difference on the same edges is 1, and $\varrho ^{\mathcal{EU}}_{(G, f)}(7.5-1, 10+1) = 1 > 0 = \varrho ^{\mathcal{EU}}_{(G,g)}(7.5, 10)$.

The ip-function generator $\sigma ^{m\mathcal {I}}$ is not balanced: for the two filtering functions on the graph of Fig. 14 the maximum difference in absolute value on the same edges is 1, but $\sigma ^{m\mathcal {I}}_{(G, f)}(3.5-1, 6+1) = 1 > 0 = \sigma ^{m\mathcal {I}}_{(G,g)}(3.5, 6)$.

The ip-function generator $\varrho ^{m\mathcal {I}}$ is not balanced: for the two filtering functions on the graph of Fig. 15 the maximum difference in absolute value on the same edges is 1, but $\varrho ^{m\mathcal {I}}_{(G, f)}(3.5-1, 5+1) = 3 > 2 = \varrho ^{m\mathcal {I}}_{(G,g)}(3.5, 5)$.

$\sigma ^\mathcal {H}$ is not a balanced ip-function generator, as the example of Fig. 16 shows: the maximum absolute value of the weight difference on the same edges is 2, but $ \sigma ^\mathcal {H}_{(G, f)}(4-2, 9+2) = 1 > 0 = \sigma ^\mathcal {H}_{(G, g)}(4,9)$.

There are counterexamples which are even simpler than this and the one of Fig. 17. These have the advantage of holding also if “>” is replaced by “$\ge $” in the definition of hub (which we do not think is a good idea).

Also $\varrho ^\mathcal {H}$ is not a balanced ip-function generator, as the example of Fig. 17 shows: the maximum absolute value of the weight difference on the same edges is 2, but $ \varrho ^\mathcal {H}_{(G, f)}(5-2, 6+2) = 1 > 0 = \varrho ^\mathcal {H}_{(G, g)}(5,6)$.

In order to show that $\sigma ^{\mathcal{DH}}$ and $\varrho ^{\mathcal{DH}}$ are not balanced, consider the weighted tournaments 010 as (G, f) and 011 as $(G', f')$. For the isomorphism $\psi $ which swaps vertices a and b, one has $|f(e)-f'\big (\psi (e)\big )| \le 1$ for all $e \in A$, but $\sigma ^{\mathcal{DH}}_{(G, f)}(2.5-1, 3+1) = \varrho ^{\mathcal{DH}}_{(G, f)}(2.5-1, 3+1) = 1 > 0 = \varrho ^{\mathcal{DH}}_{(G', f')}(2.5, 3) = \sigma ^{\mathcal{DH}}_{(G', f')}(2.5, 3)$.

The ip-function generator $\sigma ^\mathcal {K}$ is not balanced: consider the weighted tournaments 010 as (G, f) and 011 as $(G', f')$. For the isomorphism $\psi $ which swaps vertices a and b, one has $|f(e)-f'\big (\psi (e)\big )| \le 1$ for all $e \in A$, but $\sigma ^{\mathcal {K}}_{(G, f)}(2.5-1, 4+1) = 1 > 0 = \sigma ^{\mathcal {K}}_{(G', f')}(2.5, 4)$.

Finally, also $\varrho ^\mathcal {K}$ is not a balanced ip-function generator: consider the weighted tournaments 001 as (G, f) and 101 as $(G', f')$. For the isomorphism $\psi $ which swaps vertices a and c, one has $|f(e)-f'\big (\psi (e)\big )| \le 1$ for all $e \in A$, but $\varrho ^{\mathcal {K}}_{(G, f)}(2.5-1, 4+1) = 1 > 0 = \varrho ^{\mathcal {K}}_{(G', f')}(2.5, 4)$.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Bergomi, M.G., Ferri, M. & Tavaglione, A. Steady and ranging sets in graph persistence. J Appl. and Comput. Topology 7, 33–56 (2023). https://doi.org/10.1007/s41468-022-00099-1

Received: 27 September 2020
Revised: 13 July 2022
Accepted: 09 August 2022
Published: 23 September 2022
Issue Date: March 2023
DOI: https://doi.org/10.1007/s41468-022-00099-1
