Quantifying the Structural Stability of Simplicial Homology

Simplicial complexes are generalizations of classical graphs. Their homology groups are widely used to characterize the structure and the topology of data in e.g. chemistry, neuroscience, and transportation networks. In this work we assume we are given a simplicial complex and that we can act on its underlying graph, formed by the set of 1-simplices, and we investigate the stability of its homology with respect to perturbations of the edges in such graph. Precisely, exploiting the isomorphism between the homology groups and the higher-order Laplacian operators, we propose a numerical method to compute the smallest graph perturbation sufficient to change the dimension of the simplex’s Hodge homology. Our approach is based on a matrix nearness problem formulated as a matrix differential equation, which requires an appropriate weighting and normalizing procedure for the boundary operators acting on the Hodge algebra’s homology groups. We develop a bilevel optimization procedure suitable for the formulated matrix nearness problem and illustrate the method’s performance on a variety of synthetic quasi-triangulation datasets and real-world transportation networks.

1. Introduction.Models based on graphs are ubiquitous in the sciences and engineering as they allow us to model various complex systems in a unified form.Despite being widely and successfully used, graph-based models are limited to pairwise relationships and a variety of recent work has shown that many complex systems and datasets are better described by higher-order relations that go beyond pairwise interactions [5,7,9].Relational data is full of interactions that happen in groups.For example, friendship relations often involve groups that are larger than two individuals and triangles are important building blocks of relational data [1,4,24].Also in the presence of point-cloud data, directly modeling higherorder data interactions has led to improvements in numerous data mining settings, including clustering [7,16,34], link prediction [4,8], and ranking [6,33,32].
A popular extension of dyadic models to the higher-order setting uses hypergraphs and relies on the transition from matrix-based to tensor-based models [5,9].Based on recent generalizations of graph spectral theory to tensor eigenproblems [28,17], a number of graph methods based on matrices for e.g.clustering, community detection, centrality, opinion dynamics, have been extended to higher-order models based on tensors [4,34,6,25,26].
Alongside hypergraphs and tensors, simplicial complexes and higher-order Laplacians are another standard model for higher-order interactions, where simplices of different order can connect an arbitrary number of nodes [5,9].Higher-order Laplacians are linear operators that generalize the better-known graph Laplacian and provide key algebraic tools that allow to describe a simplicial complex and its structural properties.In particular, their kernels define a homology of the data and reveal fundamental topological properties such as connected components, holes, and voids [22,11].
In this work we are interested to quantifying the stability of such topological properties, with respect to edge perturbations.More precisely, given an initial simplicial complex K with a corresponding homology and an underlying graph G K formed by the 1-simplices of K, we want to quantify how far is K from another simplicial complex with a homology group of strictly different dimension.Here we assume we can act on the edges of G K and "being far" is measured in terms of the least number (or, in the weighted case, the least weight) of edges of G K that can be eliminated to change the dimension of the chosen homology group.While this form of stability is reminiscent of the persistence of the homology, which is widely studied in the literature, we remark that the two are significantly different.Rather than starting with a dataset of points on which to build a chain of simplices and an associated persistency diagram, in our setting, we assume we are given an initial simplex as a result of a data assimilation and modeling process we have no access to.Thus, acting on the elementary pairwise information we are given, the 1-simplices, we want to quantify the robustness of the homology that characterizes the simplex at hand.While a great effort has been devoted in recent years to measure the presence and persistence of simplicial homology [13,27], much less is available about the stability of the homology classes with respect to data perturbation in the non-regular (non-geometric) setting [10].
The resulting problem can be formulated as a structured matrix nearness problem, which we approach by means of a spectral objective function and the integration of the corresponding matrix-valued gradient flow.In order to make a sound mathematical formulation of the problem we aim to solve, and of the numerical model we design for its solution, we structure the remainder of the paper as follows: in Section 2 we review in detail the notion of simplicial complexes and the corresponding higher-order Laplacians.In Subsection 2.2 and Subsection 3.1 we discuss how these operators may be extended to account for weighted higher-order node relations and formulate the corresponding stability problem in Section 3 and Section 4. After introducing a suitable spectral functional whose minimization to zero mathematically translates the stability problem, in Section 5 and Section 6 we present a two-level methodology whose inner iteration consists of a constrained matrix gradient flow on which is based our numerical method, and whose outer level tunes a suitable scalar parameter in order to solve a related scalar equation.Finally, we devote Section 7 to illustrate the performance of the proposed numerical scheme on several example datasets.

Simplicial complexes and higher-order relations.
A graph G is a pair of sets (V, E), where V = {1, . . ., n} is the set of vertices and E ⊂ V ×V is a set of unordered pairs representing the undirected edges of G.We let m denote the number of edges E = {e 1 , . . ., e m } and we assume them ordered lexicographically, with the convention that i < j for any {i, j} ∈ E. For brevity, we often write ij in place of {i, j} to denote the edge joining i and j.Moreover, we assume no self-loops, i.e., ii / ∈ E for all i ∈ V.A graph only considers pairwise relations between the vertices.A simplicial complex K is a generalization of a graph that allows us to model connections involving an arbitrary number of nodes by means of higher-order simplices.Formally, a k-th order simplex (or k-simplex, briefly) is a set of k + 1 vertices {i 0 , i 1 , . . ., i k } with the property that every subset of k nodes itself is a (k − 1)-simplex.Any (k − 1)-simplex of a k-simplex is called a face.The collection of all such simplices forms a simplicial complex K, which therefore essentially consists of a collection of sets of vertices such that every subset of the set in the collection is in the collection itself.Thus, a graph G can be thought of as the collection of 0-and 1-simplices: the 0-simplices form the nodes set of G, while 1-simplices form its edges.To emphasize this analogy, in the sequel we often specify that K is a simplicial complex on the vertex set V.
Just like the edges of a graph, to any simplicial complex, we can associate an orientation (or ordering).To underline that an ordering has been fixed, we denote an ordered k-simplex σ using square brackets σ = [i 0 . . .i k ].In particular, as for the case of edges, we always assume the lexicographical ordering, unless specified otherwise.That is, we assume that: if and only if there exists h such that 0 ≤ h ≤ k, i j = i j for j = 0, . . ., h and i h < i h .As for the edges, we often write i 0 . . .i k in place of [i 0 . . .i k ] in this case.
2.1.Homology, boundary operators and higher-order Laplacians.Topological properties of a simplicial complex can be studied by considering boundary operators, higher-order Laplacians, and the associated homology.Here we recall these concepts trying to emphasize their matrix-theoretic flavor.To this end, we first fix some further notation and recall the notion of a real k-chain.Definition 2.1.Assume K is a simplicial complex on the vertex set V. For k ≥ 0, we denote the set of all the oriented k-th order simplices in K as V k (K) or simply V k .Thus, V 0 = V and V 1 = E form the underlying graph of K, which we denote by Definition 2.2.The formal real vector space spanned by all the elements of V k with real coefficients is denoted by C k (K).Any element of C k (K), the formal linear combinations of simplices in V k , is called a k-chain.
We remark that, in the graph-theoretic terminology, C 0 (K) is usually called the space of vertices' states, while C 1 (K) is usually called the space of flows in the graph.
The chain spaces are finite vector spaces generated by the set of k-simplices.The boundary and co-boundary operators are particular linear mappings between C k and C k−1 , which in a way are the discrete analogous of high-order differential operators (and their adjoints) on continuous manifolds.The boundary operator ∂ k maps a k-simplex to an alternating sum of its (k − 1)-dimensional faces obtained by omitting one vertex at a time.Its precise definition is recalled below, while Figure 1 provides an illustrative example of its action.
to the following alternated sum of its faces:  As we assume the k-simplices in V k are ordered lexicographically, we can fix a canonical basis for C k (K) and we can represent ∂ k as a matrix B k with respect to such basis.In fact, once the ordering is fixed, C k (K) is isomorphic to R V k , the space of functions from V k to R or, equivalently, the space of real vectors with We shall always assume the canonical basis for C k (K) is fixed in this way and we will deal exclusively with the matrix representation B k from now on.An example of B k for k = 1 and k = 2 is shown in Figure 1.
A direct computation shows that the following fundamental identity holds (see e.g.[22, Thm.5.7]) (2.1) for any k.This identity is also known in the continuous case as the Fundamental Lemma of Homology and it allows us to define a homology group associated with each k-chain.In fact, Equation (2.1) implies in particular that im B k+1 ⊂ ker B k , so that the k-th homology group is correctly defined: The dimensionality of the k-th homology group is known as k-th Betti's number β k = dim H k , while the elements of H k correspond to so-called k-dimensional holes in the simplicial complex.For example, H 0 , H 1 and H 2 describe connected components, holes and threedimensional voids respectively.By standard algebraic passages one sees that H k is isomorphic to ker B k B k + B k+1 B k+1 .Thus, the number of k-dimensional holes corresponds to the dimension of the kernel of a linear operator, which is known as k-th order Laplacian or higher-order Laplacian of the simplicial complex K. Definition 2.4.Given a simplicial complex K and the boundary operators B k and B k+1 , the k-th order Laplacian L k of K is the |V k | × |V k | matrix defined as: Continuous and analogous discrete manifolds with one 1-dimensional hole (dim H1 = 1).Left pane: the continuous manifold; center pane: the discretization with mesh vertices; right pane: a simplicial complex built upon the mesh.Triangles in the simplicial complex K are colored gray (right).
In particular, we remark that: • the 0-order Laplacian is the standard combinatorial graph Laplacian whose diagonal entries consist of the degrees of the corresponding vertices (i.e. the number of 1-simplices each vertex belongs to), while the off-diagonal (L 0 ) ij is equal to −1 if either ij is a 1-simplex, and it is zero otherwise; • the 1-order Laplacian is known as Hodge Laplacian to the 0-order case L 0 , one can describe the entries of L 1 in terms of the structure of the simplicial complex, see e.g.[23].
2.1.1.Connected components and holes.The boundary operators B k on K are directly connected with discrete notions of differential operators on the graph.In particular, B 1 , B 1 , and B 2 are the graph's divergence, gradient, and curl operators, respectively.We refer to [22] for more details.As H k is isomorphic to ker L k , we have that the following Hodge Thus, the space of vertex states R V 0 can be decomposed as R V 0 = im B 1 ⊕ ker L 0 , the sum of divergence-free vectors and harmonic vectors, which correspond to the connected components of the graph.In particular, for a connected graph, ker L 0 is one-dimensional and consists of entry-wise constant vectors.Thus, for a connected graph, im B 1 is the set of vectors whose entries sum up to zero.Similarly, the space of flows on graph's edges R V 1 can be decomposed as Thus, each flow can be decomposed into its gradient part im B 1 , which consists of flows with zero cycle sum, its curl part im B 2 , which consists of circulations around order-2 simplices in K, and its harmonic part ker L 1 , which represents 1-dimensional holes defined as global circulations modulo the curl flows.
While 0-dimensional holes are easily understood as the connected components of the graph G K , a notion of "holes in the graph" G K corresponds to 1-dimensional holes in K.This terminology comes from the analogy with the continuous case.In fact, if the graph is obtained as a discretization of a continuous manifold, harmonic functions in the homology group H 1 would correspond to the holes in the manifold, as illustrated in Figure 2.Moreover, the Hodge Laplacian of a simplicial complex built on N randomly sampled points in the manifold converges in the thermodynamic limit to its continuous counterpart, as N → ∞, [12].
2.2.Normalized and weighted higher-order Laplacians.In the classical graph setting, a normalized and weighted version of the Laplacian matrix is very often employed in applications.From a matrix theoretic point of view, having a weighted graph corresponds to allowing arbitrary nonnegative entries in the adjacency matrix defining the graph.In terms of boundary operators, this coincides with a positive diagonal rescaling.Analogously, the normalized Laplacian is defined by applying a diagonal congruence transformation to the standard Laplacian using the node weights.We briefly review these two constructions below.
Let G = (V, E) be a graph with positive node and edge weight functions w 0 : V → R ++ and w 1 : E → R ++ , respectively.Define the |V| × |V| and |E| × |E| diagonal matrices is the normalized and weighted boundary operator of G and we have that (2.4) is the normalized weighted graph Laplacian of G.
In particular, note that, as for L 0 , the entries of L 0 uniquely characterize the graph topology, in fact we have where deg(i) denotes the (weighted) degree of node i, i.e. deg(i) = e∈E:i∈e w 1 (e).
While the definition of k-th order Laplacian is well-established for the case of unweighted edges and simplices, a notion of weighted and normalized k-th order Laplacian is not universally available and it might depend on the application one has at hand.For example, different definitions of weighted Hodge Laplacian are considered in [11,20,22,29].
At the same time, we notice that the notation used in Equation (2.4) directly generalizes to higher orders.Thus, we propose the following notion of normalized and weighted k-th Laplacian Definition 2.5.Let K be a simplicial complex and let w k : V k → R ++ be a positive-valued weight function on the k-simplices of K. Define the diagonal matrix is the normalized and weighted k-th boundary operator, to which corresponds the normalized and weighted k-th Laplacian 1), we immediately have that B k B k+1 = 0. Thus, the group H k = ker B k / im B k+1 is well defined for any choice of positive weights w k and is isomorphic to ker L k .While the homology group may depend on the weights, we observe below that its dimension does not.Precisely, we have Proposition 2.6.The dimension of the homology groups of K is not affected by the weights of its k-simplices.Precisely, if W k are positive diagonal matrices, we have Similarly, one observes that dim ker Thus, for the homology group it holds dim 2.3.Principal spectral inheritance.Before moving on to the next section, we recall here a relatively direct but important spectral property that connects the spectra of the k-th and (k + 1)-th order Laplacians.
In other words, the variation of the spectrum of the k-th Laplacian when moving from one order to the next one works as follows: the down-term L down k+1 inherits the positive part of the spectrum from the up-term of L up k ; the eigenvectors corresponding to the inherited positive part of the spectrum lie in the kernel of L up k+1 ; at the same time, the "new" up-term L up k+1 has a new, non-inherited, part of the positive spectrum (which, in turn, lies in the kernel of the (k + 2)-th down-term).
In particular, we notice that for k = 0, since B 0 = 0 and . In other terms, the positive spectrum of the L 0 is inherited by the spectrum of L 1 and the remaining (non-inherited) part of σ + (L 1 ) coincides with σ + (L up 1 ). Figure 3 provides an illustration of the statement of Theorem 2.7 for k = 0.
3. Problem setting: Nearest complex with different homology.Suppose we are given a simplicial complex K on the vertex set V, with simplex weight functions w 0 , w 1 , . . ., and let β k = dim H k = dim H k the dimension of its k-homology.We aim at finding the closest simplex on the same vertex set V, with a strictly larger number of holes.As it is the most frequent higher-order Laplacian appearing in applications and since this provides already a large number of numerical complications, we focus here only on the Hodge Laplacian case: given the simplicial complex K = (V 0 , V 1 , V 2 , . . .), we look for the smallest perturbation of the edges V 1 which increases the dimension of H 1 .More precisely: Problem 3.1.Let K be a simplicial complex of order at least 2 with associated edge weight function w 1 and corresponding diagonal weight matrix W 1 , and let β 1 (W 1 ) be the dimension of the homology group corresponding to the weights in W 1 .For ε > 0, let In other words, Ω(ε) is an ε-sphere and Π(W 1 ) allows only non-negative simplex weights.We look for the smallest perturbation ε such that there exists a weight modification δW 1 ∈ Ω(ε) ∩ Π(W 1 ) such that β 1 (W 1 ) < β 1 (W 1 + δW 1 ).
Here, and throughout the paper, X denotes either the Frobenius norm if X is a matrix, or the Euclidean norm if X is a vector.Note that, as we are looking for the smallest ε, the equality W = ε is an obvious choice, as opposed to W ≤ ε.
As the dimension β 1 coincides with the dimension of the kernel of L 1 , we approach this problem through the minimization of a functional based on the spectrum of the 0-th and 1-st Laplacian of the simplicial complex.In order to define such functional, we first make a number of considerations.
Note that, due to Proposition 2.6, the dimension of the first homology group does not change when the edge weights are perturbed, as long as all the weights remain positive.Thus, in order to find the desired perturbation δW 1 , we need to set some of the initial weights to zero.This operation creates several potential issues we need to address, as discussed next.
First, setting an edge to zero implies that one is formally removing that edge from the simplicial complex.As the simplicial complex structure needs to be maintained, when doing so we need to set to zero also the weight of any 2-simplex that contains that edge.For this reason, if w 1 (e) = w 1 (e) + δw 1 (e) is the new edge weight function, we require the weight function of the 2-simplices to change into w 2 , defined as where f (u 1 , u 2 , u 3 ) is a function such that f (0, 0, 0) = 1 and that monotonically decreases to zero as u i → −1, for any i = 1, 2, 3.An example of such f is Second, when setting the weight of an edge to zero we may disconnect the underlying graph and create an all-zero column and row in the Hodge Laplacian.This gives rise to the phenomenon that we call "homological pollution", which we will discuss in detail in the next subsection.
3.1.Homological pollution: inherited almost disconnectedness.As the dimension of Hodge homology β 1 corresponds to the number of zero eigenvalues in L 1 , the intuition suggests that if L 1 has some eigenvalue that is close to zero, then the simplicial complex is "close to" having at least one more 1-dimensional hole.There are a number of problems with this intuitive consideration.
By Theorem 2.7 for k = 0, σ + (L 1 ) inherits σ + (L 0 ).Hence, if the weights in W 1 vary continuously so that a positive eigenvalue in σ + (L 0 ) approaches 0, the same happens to σ + (L 1 ).Assuming the initial graph G K is connected, an eigenvalue that approaches zero in σ(L 0 ) would imply that G K is approaching disconnectedness.This leads to a sort of pollution of the kernel of L 1 as an almost-zero eigenvalue which corresponds to an "almost" 0-dimensional hole (disconnected component) from L 0 is inherited into the spectrum of L 1 , but this small eigenvalue of L 1 does not correspond to the creation of a new 1-dimensional hole in the reduced complex.
To better explain the problem of homological pollution, let us consider the following illustrative example.
To mitigate the phenomenon of homological pollution, in our spectral-based functional for Problem 3.1 we include a term that penalizes the spectrum of L 0 from approaching zero.To this end, we observe below that a careful choice of the vertex weights is required.
The smallest non-zero eigenvalue of the Laplacian µ 2 ∈ σ(L 0 ) is directly related to the connectedness of the graph G K .This relation is well-known and dates back to the pioneering work of Fiedler [14].In particular, as µ 2 is a function of node and edge weights, the following generalized version of the Cheeger inequality holds (see e.g.[31]) , where h(G K ) is the Cheeger constant of the graph G K , defined as , and w 0 (S) = i∈S w 0 (i).We immediately see from (3.2) that when the graph G K is disconnected, then h(G K ) = 0 and µ 2 = 0 as well.Vice-versa, if µ 2 goes to zero, then h(G K ) decreases to zero too.While this happens independently of w 0 and w 1 , if w 0 is a function of w 1 then it might fail to capture the presence of edges whose weight is decreasing and is about to disconnect the graph.
To see this, consider the example choice w 0 (i) = deg(i), the degree of node i in G K .Note that this is a very common choice in the graph literature, with several useful properties, including the fact that no other graph-dependent constant appears in the Cheeger inequality (3.2) other than µ 2 .For this weight choice, consider the case of a leaf node, a node i ∈ V 0 that has only one edge ij 0 ∈ V 1 connecting i to the rest of the (connected) graph G K via the node j 0 .If we set w 1 (ij 0 ) = ε and we let ε decrease to zero, the graph G K is approaching disconnectedness and we would expect h(G K ) and µ 2 to decrease as well.However, one easily verifies that both µ 2 and h(G K ) are constant with respect to ε in this case, as long as ε = 0.
In order to avoid such a discontinuity, in our weight perturbation strategy for the simplex K, if w 0 is a function of w 1 , we perturb it by a constant shift.Precisely, if w 0 is the initial vertex weight of K, we set w 0 (i) = w 0 (i) + , with > 0. So, for example, if w 0 = deg and the new edge weight function w 1 (e) = w 1 (e) + δw 1 (e) is formed after the addition of δW 1 , we set w 0 (i) = j [w 1 (ij) + δw 1 (ij)] + .
4. Spectral functional for 1-st homological stability.We are now in the position to formulate our proposed spectral-based functional, whose minimization leads to the desired small perturbation that changes the first homology of K.In the notation of Problem 3.1, we are interested in the smallest perturbation ε and the corresponding modification δW 1 ∈ Ω(ε) ∩ Π(W 1 ) that increases β 1 .
As δW 1 = ε, for convenience we indicate δW 1 = εE with E = 1 so E ∈ Ω(1)∩Π ε (W 1 ), where Π ε (W 1 ) = {W | εW ∈ Π(W 1 )}.For the sake of simplicity, we will omit the dependencies and write Ω and Π ε , when there is no danger of ambiguity.Finally, let us denote by β 1 (ε, E) the first Betti number corresponding to the simplicial complex perturbed via the edge modification εE.With this notation, we can reformulate Problem 3.1 as follows: Problem 4.1.Find the smallest ε > 0, such that there exists an admissible perturbation E ∈ Ω ∩ Π ε with an increased number of holes, i.e. min ε > 0 : where is the first Betti number of the original simplicial complex.
In order to approach Problem 4.1, we introduce a target functional F (ε, E), based on the spectrum of the 1-Laplacian L 1 (ε, E) and the 0-Laplacian L 0 (ε, E), where the dependence on ε and E is to emphasize the corresponding weight perturbation is of the form Our goal is to move a positive entry from σ + (L 1 (ε, E)) to the kernel.At the same time, assuming the initial graph G K is connected, one should avoid the inherited almost disconnectedness with small positive entries of σ + (L 0 (ε, E)).As, by Theorem 2.7 for k = 0, , the only eigenvalue of L 1 (ε, E) that can be continuously driven to 0 comes from L up 1 (ε, E).For this reason, let us denote the first non-zero eigenvalue of the up-Laplacian L up 1 (ε, E) by λ + (ε, E).The proposed target functional is defined as: where α and µ are positive parameters, and µ 2 (ε, E) is the first nonzero eigenvalue of L 0 (ε, E).
As recalled in Subsection 3.1, µ 2 (ε, E) is an algebraic measure of the connectedness of the perturbed graph, thus the whole second term in (4.2) "activates" when such algebraic connectedness falls below the threshold µ.By design, F (ε, E) is non-negative and is equal to 0 iff λ + (ε, E) reaches 0, increasing the dimension of H 1 .Using this functional, we recast the Problem 4.1 as When G K is connected, dim ker L 0 = 1 and, by the Theorem 2.7, dim ker so the first nonzero eigenvalue of L up 1 is the (n+β 1 )th.While (n + β 1 ) can be a large number in practice, we will discuss in Subsection 6.1 an efficient method that allows us to compute λ + (ε, E) without computing any of the previous (n + β 1 − 1) eigenvalues.

5.
A two-level optimization procedure.We approach problem (4.3) by the means of a two-level iterative method, which is based on the successive minimization of the target functional F (ε, E) and a subsequent tuning of the parameter ε.More precisely, we propose the following two-level scheme.
1. Inner level: for fixed ε > 0, solve the minimization problem by a constrained gradient flow which we formulate below, where we denote the computed minimizer by E(ε). 2. Outer level: given the function ε → E(ε), we wish to solve the equation: (5.1) and our goal is to compute the smallest solution ε * > 0 of (5.1).A similar procedure was used in the context of graph spectral nearness in [2,3] and in other matrix nearness problems [19].
We approach the inner level by means of the constrained gradient system where P Ω∩Πε is an orthogonal projector (wrt to Frobenius inner product) onto the admissible set Ω ∩ Π ε (where ε is fixed).Since the system integrates the anti-gradient, a minimizer (at least local) of F (ε, E) is obtained as t → ∞.
We devote the next two Subsection 5.1 and Subsection 5.2 to computing the projected gradient in (5.2).The idea is to express the derivative of F in terms of the derivative of the perturbation and this way identifying the free gradient, and then projecting it onto the constraints obtaining the constrained gradient.To this end, we first compute the free gradient and then we discuss how to deal with the projection onto the admissible set.
By construction, the resulting algorithm converges to a (local) minimum of F (ε, E).Although global convergence to the global optimum is not guaranteed, in a few simple experiments we observe the method reaches the expected global solution.
5.1.The free gradient.We compute here the free gradient of F with respect to E, given a fixed ε.In order to proceed, we need a few preliminary results.
The following is a standard perturbation result for eigenvalues; see e.g.[21], where we denote by X, Y = i,j x ij y ij = Tr(X Y ) the inner product in R n×n that induces the Frobenius norm X = X, X 1/2 .Theorem 5.1 (Derivative of simple eigenvalues).Consider a continuously differentiable path of square symmetric matrices A(t) for t in an open interval I. Let λ(t), t ∈ I, be a continuous path of simple eigenvalues of A(t).Let x(t) be the eigenvector associated to the eigenvalue λ(t) and assume x(t) = 1 for all t.Then λ is continuously differentiable on I with the derivative (denoted by a dot) Moreover, "continuously differentiable" can be replaced with "analytic" in the assumption and the conclusion.
Let us denote the perturbed weight matrix by W 1 (t) = W 1 + εE(t), and the corresponding W 0 (t) = W 0 ( W 1 (t)) and W 2 (t) = W 2 ( W 1 (t)), defined accordingly as discussed in Section 3. From now on we omit the time dependence for the perturbed matrices to simplify the notation.Since W 0 , W 1 and W 2 are necessarily diagonal, by the chain rule we have , where 1 is the vector of all ones, diag(v) is the diagonal matrix with diagonal entries the vector v, and J i 1 is the Jacobian matrix of the i-th weight matrix with respect to W 1 , which for any Lemma 5.2 (Derivative of L 0 ).For the simplicial complex K with the initial edges' weight matrix W 1 and fixed perturbation norm ε, let E(t) be a smooth path and W 0 , W 1 , W 2 be corresponding perturbed weight matrices.Then, Lemma 5.3 (Derivative of L up 1 ).For the simplicial complex K with the initial edges' weight matrix W 1 and fixed perturbation norm ε, let E(t) be a smooth path and W 0 , W 1 , W 2 be corresponding perturbed weight matrices.Then, Combining Theorem 5.1 with Lemma 5.2 and Lemma 5.3 we obtain the following expression for the free gradient of the functional.
Theorem 5.4 (The free gradient of F (ε, E)).Assume the initial weight matrices W 0 , W 1 and W 2 , as well as the parameters ε > 0, α > 0 and µ > 0, are given.Additionally assume that E(t) is a differentiable matrix-valued function such that the first non-zero eigenvalue λ + (ε, E) of L up 1 (ε, E) and the second smallest eigenvalue µ 2 (ε, E) of L 0 (ε, E) are simple.Let W 0 , W 1 , W 2 be corresponding perturbed weight matrices; then: where x + is a unit eigenvector of L up 1 corresponding to λ + , y 2 is a unit eigenvector of L 0 corresponding to µ 2 , and the operator diagvec(X) returns the main diagonal of X as a vector.
Proof.To derive the expression for the gradient ∇ E F , we exploit the chain rule for the time derivative: λ Then it is sufficient to apply the cyclic perturbation for the scalar products of Lemma 5.2 and Lemma 5.3 with x + x + and y 2 y 2 respectively.The final transition requires the formula: Remark 5.5.The derivation above assumes the simplicity of both µ 2 (ε, E) and λ + (ε, E).This assumption is not restrictive as simplicity for these extremal eigenvalues is a generic property.Indeed we observe simplicity in all our numerical tests.
5.2.The constrained gradient system and its stationary points.In this section we are deriving from the free gradient determined in Theorem 5.4 the constrained gradient of the considered functional, that is the projected gradient (with respect to the Frobenius inner product) onto the manifold Ω ∩ Π ε , which consists of perturbations E of unit norm which preserve the structure of W .
In order to obtain the constrained gradient system, we need to project the unconstrained gradient given by Theorem 5.4 onto the feasible set and also to normalize E to preserve its unit norm.Using the Karush-Kuhn-Tucker conditions on a time interval where the set of 0-weight edges remain unchanged, the projection is done via the mapping P + G(ε, E), where Further, in order to comply with the constraint E(t) 2 = 1, we must have Thus, we obtain the following constrained optimization problem for the admissible direction of the steepest descent Lemma 5.6 (Direction of steepest admissible descent).Let E, G ∈ R m×m with G given by (5.2), and E = 1.On a time interval where the set of 0-weight edges remains unchanged, the gradient system reads , where κ = ε, G(E(t)), P + E(t) Proof.We need to orthogonalize E • (t) with respect to E(t).To this end, we introduce a linear orthogonality correction, i.e. we set E Equation (5.6) suggests that the system goes "primarily" in the direction of the antigradient −G(ε, E), thus the functional is expected to decrease along it.Lemma 5.7 (Monotonicity).Let E(t) of unit Frobenius norm satisfy the differential equation (5.6), with G given by (5.2).Then, F (ε, E(t)) decreases monotonically with t.
Proof.We consider first the simpler case where the non-negativity projection does not apply so that G = G(ε, E) (without P + ).Then (5.7) where the final estimate is given by the Cauchy-Bunyakovsky-Schwarz inequality.The derived inequality holds on the time interval without the change in the support of P + (so that no new edges are prohibited by the non-negativity projection).

Free Gradient Transition in the Outer
Level.The optimization problem in the inner level is non-convex due to the non-convexity of the functional F (ε, E).Hence, for a given ε, the computed minimizer E(ε) may depend on the initial guess E 0 = E 0 (ε).
The effects of the initial choice are particularly important upon the transition ε → ε = ε + ∆ε between constrained inner levels: given reasonably small ∆ε, one should expect relatively close optimizers E( ε) and E(ε), and, hence, the initial guess E 0 (ε) being close to and dependent on E(ε).
This choice, which seems very natural, determines however a discontinuity which may prevent the expected monotonicity property with respect to ε in the (likely unusual case) where F ( ε, E( ε)) < F (ε, E( ε)).This may happen in particular when ∆ε is not taken small; since the choice of ∆ε is driven by a Newton-like iteration we are interested in finding a way to prevent this situation and making the whole iterative method more robust.The goal is that of guaranteeing monotonicity of the functional both with respect to time and with respect to ε.When in the outer iteration we increase ε from a previous value ε < ε, we have the problem of choosing a suitable initial value for the constrained gradient system (5.6), such that at the stationary point E( ε) we have F ( ε, E( ε)) < F (ε, E(ε)) (which we assume both positive, that is on the left of the value ε which identifies the closest zero of the functional).
In order to maintain monotonicity with respect to time and also with respect to ε, it is convenient to start to look at the optimization problem with value ε, with the initial datum This means we have essentially to optimize with respect to the inequality constraint δW 1 ≤ ε, or equivalently solve the problem (now with inequality constrain on E F ): The scheme of alternating constrained (blue, E(t) ≡ 1) and free gradient (red) flows.Each stage inherits the final iteration of the previous stage as initial E0(εi) or E0(εi) respectively; constrained gradient is integrated till the stationary point ( ∇F (E) = 0), free gradient is integrated until δW1 = εi + ∆ε.The scheme alternates until the target functional vanishes (F (ε, E) = 0).
The situation changes only slightly from the one discussed above.If E < 1, every direction is admissible, and the direction of the steepest descent is given by the negative gradient.So we choose the free gradient flow (5.8) When E(t) = 1, then there are two possible cases.If P + G(ε, E), E ≥ 0, then the solution of (5.8) has and hence the solution of (5.8) remains of Frobenius norm at most 1.
Otherwise, if P + G(ε, E), E < 0, the admissible direction of steepest descent is given by the right-hand side of (5.6), and so we choose this ODE to evolve E. The situation can be summarized as taking, if E(t) = 1, (5.9) Along the solutions of (5.9), the functional F decays monotonically, and stationary points of (5.9) (i.e. points such that E If it can be excluded that the gradient P + G(ε, E(t)) vanishes at an optimizer, it can thus be concluded that the optimizer of the problem with inequality constraints is a stationary point of the gradient flow (5.6) for the problem with equality constraints.
Remark 5.8.As a result, F (ε, E(t)) monotonically decreases both with respect to time t and to the value of the norm ε, when ε ≤ ε .Algorithm 6.1 Pseudo-code of the complete constrained-and free-gradient flow.Require: initial edge perturbation guess E 0 ; initial ε 0 > 0; ε-stepsize ∆ε > 0; bounds α * , α * for the α-phase; for details see Section 7 2: while |F (ε, E)| < 10 −6 do 3: before the free gradient E < 1 5: E ← ConstrainedGradientFlow(E, ε) see Section 6 7: end while 6.Algorithm details.In Algorithm 6.1 we the pseudo-code of the whole bi-level procedure.The initial "α-phase" is used to choose an appropriate value for the regularization parameter α.In order to avoid the case in which the penalizing term on the right-hand side of (4.2) dominates the loss F (ε, E(t)) in the early stages of the descent flow, we select α by first running such an initial phase, prior to the main alternated constrained/free gradient loop.In this phase, we fix a small ε = ε 0 and run the constrained gradient integration for an initial α = α * .After the computation of a local optimum E * , we then increase α and rerun for the same ε 0 with E * as starting point.We iterate until no change in E * is observed or until α reaches an upper bound α * .
The resulting value of α and E * are then used in the main loop where we first increase ε by the chosen step size, we rescale E i by 0 < ε/(ε + ∆ε) < 1, and then we perform the free gradient integration described in Subsection 5.3 until we reach a new point E i on the unit sphere E i = 1.Then, we perform the inner constrained gradient step by integrating Equation (5.6), iterating the following two-step norm-corrected Euler scheme: (6.1) where the second step is necessary to numerically guarantee the Euler integration remains in the set of admissible flows since the discretization does not conserve the norm and larger steps h i may violate the non-negativity of the weights.
In both the free and constrained integration phases, since we aim to obtain the solution at t → ∞ instead of the exact trajectory, we favor larger steps h i given that the established monotonicity is conserved.Specifically, if F (ε, E i+1 ) < F (ε, E i ), then the step is accepted and we set h i+1 = βh i with β > 1; otherwise, the step is rejected and repeated with a smaller step h i ← h i /β.Remark 6.1.The step acceleration strategy described above, where βh i is immediately increased after one accepted step, may lead to "oscillations" between accepted and rejected steps in the event the method would prefer to maintain the current step size h i .To avoid this potential issue, in our experiments we actually increase the step length after two consecutive accepted steps.Alternative step-length selection strategies are also possible, for example, based on Armijo's rule or non-monotone stabilization techniques [18].
6.1.Computational costs.Each step of either the free or the constrained flows requires one step of explicit Euler integration along the anti-gradient −∇ E F (ε, E(t)).As discussed in Section 5, the construction of such a gradient requires several sparse and diagonal matrixvector multiplications as well as the computation of the smallest nonzero eigenvalue of both L up 1 (ε, E) and L 0 (ε, E).The latter two represent the major computational requirements of the numerical procedure.Fortunately, as both matrices are of the form A A, with A being either of the two boundary or co-boundary operators B 2 and B 1 , we have that both the two eigenvalue problems boil down to a problem of the form min x⊥ ker A Ax x i.e., the computation of the smallest singular value of the sparse matrix A. This problem can be approached by a sparse singular value solver based on a Krylov subspace scheme for the pseudo inverse of A A. In practice, we implement the pseudo inversion by solving the corresponding least squares problems which, in our experiments, we solved using the least square minimal-residual method (LSMR) from [15].This approach allows us to use a preconditioner for the normal equation corresponding to the least square problem.For simplicity, in our tests we used a constant preconditioner computed by means of an incomplete Cholesky factorization of the initial unperturbed L up 1 , or L 0 .Possibly, much better performance can be achieved with a tailored preconditioner that is efficiently updated throughout the matrix flow.We also note that the eigenvalue problem for the graph Laplacian L 0 (ε, E) may be alternatively approached by a combinatorial multigrid strategy [30].However, designing a suitable preconditioning strategy goes beyond the scope of this work and will be the subject of future investigations.
7. Numerical experiments.In this section, we provide several synthetic and real-world example applications of the proposed stability algorithm.The code for all the examples is available at https://github.com/COMPiLELab/HOLaGraF.All experiments are run until the global stopping criterion |F (ε, E(t))| < 10 −6 is met.The parameters µ and α are chosen as follows.Concerning µ, we set µ = 0.75µ 2 , where µ 2 is the smallest nonzero eigenvalue of the initial Laplacian L 0 .As for α, we run the α-phase described in Section 6 with parameters ε 0 = 10 −3 , α * = 1 and α * = 100.
The exemplary run of the optimization framework in time is shown on Figure 7.The top panel of Figure 7 provides the continued flow of the target functional F (ε, E(t)) consisting of the initial α-phase (in green) and alternated constrained (in blue) and free gradient (in orange) stages.As stated above, F (ε, E(t)) is strictly monotonic along the flow since the support of P + does not change.Since the initial setup is not pathological with respect to the connectivity, the initial α-phase essentially reduces to a single constrained gradient flow and terminates after one run with α = α * .The constrained gradient stages are characterized by a slow changing E(t), which is essentially due to the flow performing small adjustments to find We sample networks with a varying number of vertices N = 10, 16, 22, 28 and varying sparsity pattern ν = 0.35, 0.5 which determine the number of edges in the output as m = ν N (N −1)

2
. Due to the highly randomized procedure, topological structures of a sampled graph with a fixed pair of parameters may differ substantially, so 10 networks with the same (N, ν) pair are generated.For each network, the working time (without considering the sampling itself) and the resulted perturbation norm ε, and are reported in Figure 8b and Figure 8c, respectively.As anticipated in Subsection 6.1, we show the performance of two implementations of the method, one based on LSMR and one based on LSMR preconditioned by using the incomplete Cholesky factorization of the initial matrices.We observe that, • the computational cost of the whole procedure lies between O(m 2 ) and O(m 3 ) • denser structures, with a higher number of vertices, result in the higher number of edges being eliminated; at the same time, even most dense cases still can exhibit structures requiring the elimination of a single edge, showing that the flow does not necessarily favor multi-edge optima; • the required perturbation norm ε is growing with the size of the graph, Figure 8c, but not too fast: it is expected that denser networks would require larger ε to create a new hole; at the same time if the perturbation were to grow drastically with the sparsity ν, it would imply that the method tries to eliminate sufficiently more edges, a behavior that resembles convergence to a sub-optimal perturbation; • preconditioning with a constant incomplete Cholesky multiplier, computed for the initial Laplacians, provides a visible execution time gain for medium and large networks.Since the quality of the preconditioning deteriorates as the flow approaches the minimizer (as a non-zero eigenvalue becomes 0), it is worth investigating the design of a preconditioner for the up-Laplacian that can be efficiently updated.

Transportation
Networks.Finally, we provide an application to real-world examples based on city transportation networks.We consider networks for Bologna, Anaheim, Berlin Mitte, and Berlin Tiergarten; each network consists of nodes -intersections/public transport stops -connected by edges (roads) and subdivided into zones; for each road the free flow time, length, speed limit are known; moreover, the travel demand for each pair of nodes is provided through the dataset of recorded trips.All the datasets used here are publicly available at https://github.com/bstabler/TransportationNetworks;Bologna network is provided by the Physic Department of the University of Bologna (enriched through the Google Maps API https://developers.google.com/maps).
The regularity of city maps naturally lacks 3-cliques, hence forming the simplicial complex based on triangulations as done before frequently leads to trivial outcomes.Instead, here we "lift" the network to city zones, thus more effectively grouping the nodes in the graph.Specifically: 1. we consider the completely connected graph where the nodes are zones in the city/region; 2. the free flow time between two zones is temporarily assigned as a weight of each edge: the time is as the shortest path between the zones (by the classic Dijkstra algorithm) on the initial graph; 3. similarly to what is done in the filtration used for persistent homology, we filter out excessively distant nodes; additionally, we exclude the longest edges in each triangle in case it 1st original eigenflow 2nd original eigenflow  is equal to the sum of two other edges (so the triangle is degenerate and the trip by the longest edge is always performed through to others); 4. finally, we use the travel demand as an actual weight of the edges in the final network; travel demands are scaled logarithmically via the transformation w i → log 10 w i 0.95 min w i ; see the example on the left panel of Figure 9.Given the definition of weights in the network, high instability (corresponding to small perturbation norm ε) implies structural phenomena around the "almost-hole", where the faster and shorter route is sufficiently less demanded.
The results for four different networks are summarized in the Table 1; p mimics the percentile, ε/ e∈V 1 w i (e), showing the overall small perturbation norm contextually.At the same time, we emphasize that except Bologna (which is influenced by the geographical topology of the land), the algorithm does not choose the smallest weight possible; indeed, given our interpretation of the topological instability, the complex for Berlin-Tiergarten is Table 1 Topological instability of the transportation networks: filtered zone networks with the corresponding perturbation norm ε and its percentile among w1(•) profile.For each simplicial complex the number of nodes, edges and triangles in V2(K) are provided alongside the initial number of holes β1.The results of the algorithm consist of the perturbation norm, ε, computation time, and approximate percentile p. stable and the transportation network is effectively constructed.8. Discussion.In the current work, we formulate the notion of k-th order topological stability of a simplicial complex K as the smallest data perturbation required to create one additional k-th order hole in K.By introducing an appropriate weighting and normalization, the stability is reduced to a matrix nearness problem solved by a bi-level optimization procedure.Despite the highly nonconvex landscape, our proposed procedure alternating constrained and free gradient steps yields a monotonic descending scheme.Our experiments show that this approach is generally successful in computing the minimal perturbation increasing β 1 (ε, E), even for potentially difficult cases, as the one proposed in Subsection 7.1.
For simplicity, here we limit our attention to the smallest perturbation that introduces only one new hole.However, a simple modification may be employed to address the case of the introduction of m new holes: include the sum of m nonzero eigenvalues of L up 1 (ε, E) rather than just the first one in the spectral functional (4.2).We also remark that, due to the spectral inheritance principle 2.7, the proposed framework for H 1 can be in principle extended to a general H k however, this extension requires nontrivial considerations on the data modification procedure and on the numerical linear algebra tools, as a nontrivial topology of higher-order requires a much denser network.
Different improvements are possible in terms of numerical implementation, including investigating the use of more sophisticated (e.g.implicit) integrators for the gradient flow system (5.6), which would additionally require the use of higher-order derivatives of λ + (ε, E).Moreover, as already mentioned in Subsection 6.1, the numerical method for the computation of the small singular values would benefit from the use of an efficient preconditioner that can be effectively updated throughout the flow.Investigations in this direction are in progress and will be the subject of future work.

Figure 1 .
Figure 1.Left-hand side panel: example of simplicial complex K on 7 nodes, and of the action of ∂2 on the 2-simplex [1, 2, 3]; 2-simplices included in the complex are shown in red, arrows correspond to the orientation.Panels on the right: matrix forms B1 and B2 of boundary operators ∂1 and ∂2 respectively.

Figure 3 . 1 and L up 1
Figure 3. Illustration for the principal spectrum inheritance (Theorem 2.7) in case k = 0: spectra of L1, L down 1 and L up 1 are shown.Colors signify the splitting of the spectrum, λi > 0 ∈ σ(L1) ; all yellow eigenvalues are inherited from σ+(L0); red eigenvalues belong to the non-inherited part.Dashed barrier µ signifies the penalization threshold (see the target functional in Section 4) preventing homological pollution (see Subsection 3.1).

Figure 4 .
Figure 4. Example of the homological pollution, Example 3.2, for the simplicial complex K on 7 vertices; the existing hole is [2, 3, 4, 5] (left and center pane), all 3 cliques are included in the simplicial complex and shown in blue.The left pane demonstrates the initial setup with 1 hole; the center pane retains the hole exhibiting spectral pollution; the continuous transition to the eliminated edges with β1 = 0 (no holes) is shown on the right pane.
in the following two lemmas, we express the time derivative of the Laplacian L 0 and L up 1 as functions of E(t).The proofs of these results are straightforward and omitted for brevity.In what follows, Sym[A] denotes the symmetric part of the matrix A, namely Sym[A] = (A + A )/2.

Figure 6 . 31 Figure 7 .
Figure 6.Simplicial complex K on 8 vertices for the illustrative run (on the left): all 2-simplices from V2 are shown in blue, the weight of each edge w1(ei) is given on the figure.On the right: perturbed simplicial complex K through the elimination of the edge[5,6] creating additional hole[5,6,7,8].

Figure 9 .
Figure 9. Example of the Transportation Network for Bologna.Left pane: original zone graph where the width of edges corresponds to the weight, to-be-eliminated edge is colored in red.Right pane: eigenflows, original and created; color and width correspond to the magnitude of entries.