Move Schedules: Fast persistence computations in coarse dynamic settings

Matrix reduction is the standard procedure for computing the persistent homology of a filtered simplicial complex with $m$ simplices. Its output is a particular decomposition of the total boundary matrix, from which the persistence diagrams and generating cycles are derived. Persistence diagrams are known to vary continuously with respect to their input, motivating the study of their computation for time-varying filtered complexes. Computing persistence dynamically can be reduced to maintaining a valid decomposition under adjacent transpositions in the filtration order. Since there are $O(m^2)$ such transpositions, this maintenance procedure exhibits limited scalability and is often too fine for many applications. We propose a coarser strategy for maintaining the decomposition over a 1-parameter family of filtrations. By reduction to a particular longest common subsequence problem, we show that the minimal number of decomposition updates $d$ can be found in $O(m \log \log m)$ time and $O(m)$ space, and that the corresponding sequence of permutations -- which we call a schedule -- can be constructed in $O(d m \log m)$ time. We also show that, in expectation, the storage needed to employ this strategy is actually sublinear in $m$. Exploiting this connection, we show experimentally that the decrease in operations to compute diagrams across a family of filtrations is proportional to the difference between the expected quadratic number of states and the proposed sublinear coarsening. Applications to video data, dynamic metric space data, and multiparameter persistence are also presented.


Introduction
Given a triangulable topological space equipped with a tame continuous function, persistent homology captures the changes in topology across the sublevel sets of the space, and encodes them in a persistence diagram.The stability of persistence contends that if the function changes continuously, so too will the points on the persistence diagram [21,22].This motivates the application of persistence to time-varying settings, like that of dynamic metric spaces [30].As persistence-related computations tend to exhibit high algorithmic complexity-essentially cubic 4 in the size of the underlying filtration [39]-their adoption to dynamic settings poses a challenging computational problem.Currently, there is no recourse when faced with a time-varying complex containing millions of simplices across thousands of snapshots in time.Acquiring such a capability has far-reaching consequences: methods that vectorize persistence diagrams for machine learning purposes all immediately become computationally viable tools in the dynamic setting.Such persistence summaries include adaptive template functions [43], persistence images [2], and α-smoothed Betti curves [46].
Cohen-Steiner et al. refer to a continuous 1-parameter family of persistence diagrams as a vineyard, and they give in [22] an efficient algorithm for their computation.The vineyards approach can be interpreted as an extension of the reduction algorithm [48], which computes the persistence diagrams of a filtered simplicial complex K with m simplices in O(m 3 ) time, via a particular decomposition R = DV (or RU = D) of the boundary matrix D of K.The vineyards algorithm, in turn, transforms a time-varying filtration into a certain set of permutations of the decomposition R = DV , each of which takes at most O(m) time to execute.If one is interested in understanding how the persistent homology of a continuous function changes over time, then this algorithm is sufficient, for homological critical points can only occur when the filtration order changes.Moreover, the vineyards algorithm is efficient asymptotically: if there are d time-points where the filtration order changes, then vineyards takes O(m 3 + md) time; one initial O(m 3 )time reduction at time t 0 followed by one O(m) operation to update the decomposition at the remaining time points (t 1 , t 2 , . . ., t d ).When d >> m, the initial reduction cost is amortized by the cost of maintaining the decomposition, implying each diagram produced takes just linear time per time point to obtain.Despite its theoretical efficiency, vineyards is often not the method of choice in practical settings.While there is an increasingly rich ecosystem of software packages offering variations of the standard reduction algorithm (e.g.Ripser, PHAT, Dionysus, etc. see [41] for an overview), implementations of the vineyards algorithm are relatively uncommon. 5The reason for this disparity is perhaps explained by Lesnick and Wright [35]: "While an update to an RU decomposition involving few transpositions is very fast in practice... many transpositions can be quite slow... it is sometimes much faster to simply recompute the RU -decomposition from scratch using the standard persistence algorithm."Indeed, they observe that maintaining the decomposition along a certain parameterized family is the most computationally demanding aspect of RIVET [44], a software for computing two-parameter persistent homology.
The work presented here seeks to further understand and remedy this discrepancy: building on the work presented in [15], we introduce a coarser approach to the vineyards algorithm.Though the vineyards algorithm is efficient at constructing a continuous 1-parameter family of diagrams, it is not necessarily efficient when the parameter is coarsely discretized.Our methodology is based on the observation that practitioners often don't need (or want!) all the persistence diagrams generated by a continuous 1-parameter of filtrations; usually just n << d of them suffice.By exploiting the "donor" concept introduced in [15], we are able to make a trade-off between the number of times the decomposition is restored to a valid state and the granularity of the decomposition repair step, reducing the total number of column operations needed to apply an arbitrary permutation to the filtration.This trade-off, paired with a fast greedy heuristic explained in section 3.4.2,yields an algorithm that can update a R = DV decomposition more efficiently than vineyards in coarse time-varying contexts, making dynamic persistence more computationally tractable for a wider class of use-cases.Both the source code containing the algorithm we propose and the experiments performed in Section 4 are available open source online. 6

Related Work
To the author's knowledge, work focused on ways of updating a decomposition R = DV , for all homological dimensions, is limited: there is the vineyards algorithm [22] and the moves algorithm [15], both of which are discussed extensively in section 2. At the time of writing, we were made aware of very recent work [38] that iteratively repairs a permuted decomposition via a column swapping strategy, which they call "warm starts."Though their motivation is similar to our own, their approach relies on the reduction algorithm as a subprocedure, which is quite different from the strategy we employ here.
Contrasting the dynamic setting, there is extensive work on improving the efficiency of computing a single (static) R = DV decomposition.Chen [18] proposed persistence with a twist, also called the clearing optimization, which exploits a boundary/cycle relationship to "kill" columns early in the reduction rather than reducing them.Another popular optimization is to utilize the duality between homology and cohomology [23], which dramatically improves the effectiveness of the clearing optimization [5].There are many other optimizations on the implementation side: the use of ranking functions defined on the combinatorial number system enables implicit cofacet enumeration, removing the need to store the boundary matrix explicitly; the apparent/emergent pairs optimization identifies columns whose pivot entries are unaffected by the reduction algorithm, reducing the total number of columns which need to be reduced; sparse data structures such as bit-trees and lazy heaps allow for efficient column-wise additions with Z 2 = Z/2Z coefficients and effective O(1) pivot entry retrieval, and so on [5,6].
By making stronger assumptions on the underlying topological space, restricting the homological dimension, or targeting a weaker invariant (e.g.Betti numbers), one can usually obtain faster algorithms.For example, Attali et al. [3] give a linear time algorithm for computing persistence on graphs.In the same paper, they describe how to obtain ϵ-simplifications of 1-dimensional persistence diagrams for filtered 2-manifolds by using duality and symmetry theo-rems.Along a similar vein, Edelsbrunner et al. [27] give a fast incremental algorithm for computing persistent Betti numbers up to dimension 2, again by utilizing symmetry, duality, and "time-reversal" [24].Chen & Kerber [19] give an output-sensitive method for computing persistent homology, utilizing the property that certain submatrices of D have the same rank as R, which they exploit through fast sub-cubic rank algorithms specialized for sparse-matrices.
If zeroth homology is the only dimension of interest, computing and updating both the persistence and rank information is greatly simplified.For example, if the edges of the graph are in filtered order a priori, obtaining a tree representation fully characterizing the connectivity of the underlying space (also known as the incremental connectivity problem) takes just O(α(n)n) time using the disjoint-set data structure, where α(n) is the extremely slow-growing inverse Ackermann function.Adapting this approach to the time-varying setting, Oesterling et al. [40] give an algorithm that maintains a merge tree with e edges in O(e) time per-update.If only Betti numbers are needed, the zerothdimension problem reduces even further to the dynamic connectivity problem, which can be efficiently solved in amortized O(log n) query and update times using either Link-cut trees or multi-level Euler tour trees [29].

A Motivating Example
To motivate this effort, we begin with an illustrative example of why the vineyards algorithm does not always yield an efficient strategy for time-varying settings.Consider a series of grayscale images (i.e. a video) depicting a fixed-width annulus expanding about the center of a 9 × 9 grid, and its associated sublevel-set filtrations, as shown in Figure 1.Each image in the series consists of pixels whose intensities vary with time, upon which we build a simplicial complex using the Freudenthal triangulation of the plane.For each complex, we create a filtration of simplices whose order is determined by the lower stars of pixel values.Two events critically change the persistence diagrams: the first occurs when the central connected component splits to form a cycle, and the second when the annulus splits into four components.From left to right, the ϵ-persistent Betti numbers 7 of the five evenly spaced 'snapshots' of the filtration shown in Figure 1 are: (β 0 , β 1 ) = (1, 0), (1, 1), (1, 1), (1, 1), (4, 0).Thus, in this example, only a few persistence diagrams are needed to capture the major changes to the topology.
We use this data set as a baseline for comparing vineyards and the standard reduction algorithm pHcol (Algorithm 4).Suppose a practitioner wanted to know the major homological changes a time-varying filtration encounters over time.Since it is unknown a priori when the persistent pairing function changes, one solution is to do n independent persistence computations at n evenly spaced points in the time domain.An alternative approach is to construct a homotopy between a pair of filtrations (K, f ), (K, f ′ ) and then decompose this homotopy into adjacent transpositions based on the filtration order-the vineyards approach.We refer to the former as the discrete setting, which is often used in practice, and the latter as the continuous setting.Note that though the discrete setting is often more practical, it is not guaranteed to capture all homological changes in persistence that occur in the continuous 1-parameter family of diagrams.
The cumulative cost (in total column operations) of these various approaches are shown in Figure 2, wherein the reduction (pHcol) and vineyard algorithms are compared.Two discrete strategies (green and purple) and two continuous strategies (black and blue) are shown.
Note that without knowing where the persistence pairing function changes, a continuous strategy must construct all ≈ 7 × 10 4 diagrams induced by the homotopy.In this setting, as shown in the figure, the vineyards approach is indeed far more efficient than naively applying the reduction algorithm independently at all time points.However, when the discretization of the time domain is coarse enough, the naive approach actually performs less column operations than vineyards, while still capturing the main events.The existence of a time discretization that is more efficient to compute than continually updating the decomposition indicates that the vineyards framework must incur some overhead (in terms of column operations) to maintain the underlying decomposition, even when the pairing function determining the persistence diagram is unchanged.Indeed, as shown by the case where n = 10, applying pHcol independently between relatively "close" filtrations is substantially more efficient than iteratively updating the decomposition.Moreover, any optimizations to the reduction algorithm (e.g.clearing [18]) would only increase this disparity.Since persistence has found many applications in dynamic contexts [45,47,35,30], a more efficient alternative to vineyards is clearly needed.
Our approach and contributions are as follows: First, we leverage the moves framework of Busaryev et al. [15] to include coarser operations for dynamic persistence settings.By a reduction to an edit distance problem, we give a lower bound on the minimal number of moves needed to perform an arbitrary permutation to the R = DV decomposition, along with a proof of its optimality.We also give worst-case sizes of these quantities in expectation as well as efficient algorithms for constructing these operations-both of which are derived from a reduction to the Longest Increasing Subsequence (LIS) problem.These operations parameterize sequences of permutations S = (s 1 , s 2 , . . ., s d ) of minimal size d, which we call schedules.However, not all minimal size schedules incur the same cost (i.e., number of column operations).We investigate the feasibility of choosing optimal cost schedules, and show that greedy-type approaches can lead to arbitrarily bad behavior.In light of these results, we give an alternative proxy-objective for cost minimization, provide bounds justifying its relevance to the original problem, and give an efficient O(d 2 log m) algorithm for heuristically solving this proxy minimization.A performance comparison with other reduction-based persistence computations is given, wherein move schedules are demonstrated to be an order of magnitude more efficient than existing approaches at calculating persistence in dynamic settings.In particular, we illustrate the effectiveness of efficient scheduling with a variety of real-world applications, including flock analysis in dynamic metric spaces and manifold detection from image data using 2D persistence computations.

Main results
Given a simplicial complex K with filtration function f , denote by R = DV the decomposition of its corresponding boundary matrix D such that R is reduced and V is upper-triangular (see section 2.1 for details).If one has a pair of filtrations (K, f ), (K, f ′ ) of size m = |K| and R = DV has been computed for (K, f ), then it may be advantageous to use the information stored in (R, V ) to reduce the computation of R ′ = D ′ V ′ .Given a permutation P such that D ′ = P DP T , such an update scheme has the form: where * is substituted with elementary column operations that repair the permuted decomposition.It is known how to linearly interpolate f → f ′ using d ∼ O(m 2 ) updates to the decomposition, where each update requires at most two column operations [22].Since each column operation takes O(m), the complexity of re-indexing f → f ′ is O(m 3 ), which is efficient if all d decompositions are needed.Otherwise, if only (R ′ , V ′ ) is needed, updating R → R ′ using the approach from [22] matches the complexity of computing R ′ = D ′ V ′ independently.
We now summarize our main results (Theorem 1): suppose one has a schedule S = (s 1 , s 2 , . . ., s d ) yielding a corresponding sequence of decompositions: where s k = (i k , j k ) for k = 1, . . ., d, denotes a particular type of cyclic permutation (see section 3.2).If i k < j k for all s k ∈ S, our first result extends [15] by showing that (1) can be computed using O(ν) column operations, where: The quantities |I k | and |J k | depend on the sparsity of the matrices V k and R k , respectively, and d ∼ O(m) is a constant that depends on how similar f and f ′ are.As this result depends explicitly on the sparsity pattern of the decomposition itself, it is an output sensitive bound.
Our second result turns towards lower bounding d = |S| and the complexity of constructing S itself.By reinterpreting a special set of cyclic permutations as edit operations on strings, we find that any sequence mapping f to f ′ of minimal size must have length (Proposition 3): where LCS(f, f ′ ) refers to the size of the longest common subsequence between the simplexwise filtrations (K, f ) and (K, f ′ ) (see section 3.2 for more details).We also show that the information needed to construct any S with optimal size can be computed in O(m log log m) preprocessing time and O(m) memory.We provide evidence that d ∼ m − √ m in expectation for random filtrations (Corollary 2).Although this implies d can be O(m) for pathological inputs, we give empirical results suggesting d can be much smaller in practice.
Outline: The paper is organized as follows: we review and establish the notations we will use to describe simplicial complexes, persistent homology, and dynamic persistence in Section 2. We also cover the reduction algorithm (designated here as pHcol), the vineyards algorithm, and the set of move-related algorithms introduced in [15], which serves as the starting point of this work.In Section 3 we introduce move schedules, and provide efficient algorithms to construct them.In Section 4 we present applications of the proposed method, including applications related to dynamic metric spaces and 2-parameter persistence.In Section 5 we conclude the paper by discussing other possible applications and future work.

Background
Suppose one has a family {K i } i∈I of simplicial complexes indexed by a totally ordered set I, and so that for any i < j ∈ I we have K i ⊆ K j .Such a family is called a filtration, which is deemed simplexwise if K j ∖ K i = {σ j } whenever j is the immediate successor of i in I. Any finite filtration may be trivially converted into a simplexwise filtration via a set of condensing, refining, and reindexing maps (see [5] for more details).Equivalently, a filtration can be also defined as a pair (K, f ) where K is a simplicial complex and f : Here, we consider two index sets: [m] = {1, . . ., m} and R. Without loss of generality, we exclusively consider simplexwise filtrations, but for brevity-sake refer to them simply as filtrations.
Let K be an abstract simplicial complex and F a field.A p-chain is a formal F-linear combination of p-simplices of K.The collection of p-chains under addition yields an F-vector space denoted C p (K).The p-boundary ∂ p (σ) of a p-simplex σ ∈ K is the alternating sum of its oriented co-dimension 1 faces, and the p-boundary of a p-chain is defined linearly in terms of its constitutive simplices.A p-chain with zero boundary is called a p-cycle, and together they form Z p (K) = Ker ∂ p .Similarly, the collection of p-boundaries forms B p (K) = Im ∂ p+1 .Since ∂ p • ∂ p+1 = 0 for all p ≥ 0, then the quotient space H p (K) = Z p (K)/B p (K) is well-defined, and called the p-th homology of K with coefficients in F. If f : K → [m] is a filtration, then the inclusion maps K i ⊆ K i+1 induce linear transformations at the level of homology: ) Simplices whose inclusion in the filtration creates a new homology class are called creators, and simplices that destroy homology classes are called destroyers.The filtration indices of these creators/destroyers are referred to as birth and death times, respectively.The collection of birth/death pairs (i, j) is denoted dgm p (K, f ), and referred to as the p-th persistence diagram of (K, f ).If a homology class is born at K i and dies entering K j , the difference |i − j| is called the persistence of that class.In practice, filtrations often arise from triangulations parameterized by geometric scaling parameters, and the "persistence" of a homology class refers to its lifetime with respect to the scaling parameter.
Let X be a triangulable topological space; that is, so that there exists an abstract simplicial complex K whose geometric realization is homeomorphic to X. Let f : X → R be continuous and write X a = f −1 (−∞, a] to denote the sublevel sets of X defined by the value a.A homological critical value of f is any value a ∈ R such that the homology of the sublevel sets of f changes at a, i.e. if for some p the inclusion-induced homomorphism H p (X a−ϵ ) → H p (X a+ϵ ) is not an isomorphism for any small enough ϵ > 0. If there are only finitely many of these homological critical values, then f is said to be tame.The concept of homological critical points and tameness will be revisited in section 2.2.

The Reduction Algorithm
In this section, we briefly recount the original reduction algorithm introduced in [48], also sometimes called the standard algorithm or more explicitly pHcol [23].The pseudocode is outlined in Algorithm 4 in the appendix.Without optimizations, like clearing or implicit matrix reduction, the standard algorithm is very inefficient.Nonetheless, it serves as the foundation of most persistent homology implementations, and its invariants are necessary before introducing both vineyards in section 2.2 and our move schedules in section 3.
Given a filtration (K, f ) with m simplices, the output of the reduction algorithm is a matrix decomposition R = DV , where the persistence diagrams are encoded in R and the generating cycles in the columns of V .To begin the reduction, one first assembles the elementary boundary chains ∂(σ) as columns ordered according to f into a m × m filtration boundary matrix D. Setting V = I and R = D, one proceeds by performing elementary left-to-right column operations on V and R until the following invariants are satisfied: Decomposition Invariants: where low R (i) denotes the largest row index of a non-zero entry in column i of R. We call the decomposition satisfying these three invariants valid.The persistence diagrams of the corresponding filtration can be determined from the lowest entries in R. Note that though R and V are not unique, the collection of persistent pairings are [48].
It is at times more succinct to restrict to specific sub-matrices of D based on the homology dimension p, and so we write D p to represent the d p−1 × d p matrix representing ∂ p (the same notation is extended to R and V ).We illustrate the reduction algorithm with an example below.
Example 2.1: Consider a triangle with vertices u, v, w, edges a = (u, w), b = (v, w), c = (u, v), and whose filtration order is given as (u, v, w, a, b, c).Using Z 2 coefficients, the reduction proceeds to compute (R 1 , V 1 ) as follows: Since column c in R 1 is 0, the 1-chain indicated by the column c in V 1 represents a dimension 1 cycle.Similarly, the columns at u, v, w in R 0 (not shown) are all zero, indicating three 0-dimensional homology classes are born, two of which are killed by the pivot entries in columns a and b in R 1 .
Inspection of the algorithm from [27] suggests an upper bound for the reduction is O(m 3 ), where m is the number of simplices of the filtration-this bound is in fact tight [39].Despite its high algorithmic complexity, many variations and optimizations to Algorithm 4 have been proposed over the past decade, see [5,6,18] for an overview.

Vineyards
Consider a homotopy F (x, t) : X × [0, 1] → R on a triangulable topological space X, and denote its "snapshot" at a given time-point t by f t (x) = F (x, t).The snapshot f 0 denotes the initial function at time t = 0 and f 1 denotes the function at the last time step.As t varies in [0, 1], the points in dgm p (f t ) trace curves in R 3 which, by the stability of persistence, will be continuous if F is continuous and the f t 's are tame.Cohen-Steiner et al. [21] referred to these curves as vines, a collection of which forms as vineyard-the geometric analogy is meant to act as a guidepost for practitioners seeking to understand the evolution of topological structure over time.
The original purpose of vineyards, as described in [22], was to compute a continuous 1-parameter family of persistence diagrams over a time-varying filtration, detecting homological critical events along the way.As homological critical events only occur when the filtration order changes, detecting all such events may be reduced to computing valid decompositions at time points interleaving all changes in the filtration order.For simplexwise filtrations, these changes manifest as transpositions of adjacent simplices, and thus any fixed set of rules that maintains a valid R = DV decomposition under adjacent column transpositions is sufficient to compute persistence dynamically.
To ensure a decomposition is valid, these rules prescribe certain column and row operations to apply to a given matrix decomposition either before, during, or after each transposition.Formally, let S j i represent the upper-triangular matrix such that AS j i results in adding column i of A to column j of A, and let S j i A be the same operation on rows i and j.Similarly, let P denote the matrix so that AP T permutes the columns of A and P A permutes the rows.Since the columns of P are orthonormal, P −1 = P T , then P AP T performs the same permutation to both the columns and rows of A. In the special case where P represents a transposition, we have P = P T and may instead simply write P AP .The goal of the vineyards algorithm can now be described explicitly: to prescribe a set of rules, written as matrices S j i , such that if R = DV is a valid decomposition, then ( * P * R * P * ) = (P DP )( * P * V * P * ) is also a valid decomposition, where * is some number (possibly zero) of matrices encoding elementary column or row operations.
Example 2.2 To illustrate the basic principles of vineyards, we re-use the running example introduced in the previous section.Below, we illustrate the case of exchanging simplices a and b in the filtration order, and restoring RV to a valid decomposition.
Starting with a valid reduction R = DV and prior to performing the exchange, observe that the highlighted entry in V 1 would render V 1 non-upper triangular after the exchange.This entry is removed by a left-to-right column operation, given by applying S 2 1 on the right to R 1 and V 1 .After this operation, the permutation may be safely applied to V 1 .Both before and after the permutation P , R 1 is rendered non-reduced, requiring another column operation to restore the decomposition to a valid state.
The time complexity of vineyards is determined entirely by the complexity of performing adjacent transpositions.Since column operations are the largest complexity operations needed and each column can have potentially O(m) entries, the complexity of vineyards is O(m) per transposition.Inspection of the individual cases of the algorithm from [22] shows that any single transposition requires at most two such operations on both R and V .However, several factors can affect the runtime efficiency of the vineyards algorithm.On the positive side, as both the matrices R and V are often sparse, the cost of a given column operation is proportional to the number of non-zero entries in the two columns being modified.Moreover, as a rule of thumb, it has been observed that most transpositions require no column operations [27].On the negative side, one needs to frequently query the non-zero status of various entries in R and V (consider evaluating e.g.Case 1.1 in [22]), which accrues a non-trivial runtime cost due to the quadratic frequency with which they are required.

Moves
Originally developed to accelerate tracking generators with temporal coherence, Busaryev et al. [15] introduced an extension of the vineyards algorithm which maintains a R = DV decomposition under move operations.A move operation Move(i, j) is a set of rules for maintaining a valid decomposition under the permutation P that moves a simplex σ i at position i to position j.If j = i ± 1, this operation is an adjacent transposition, and in this sense moves generalizes vineyards.However, the move framework presented by Busaryev is actually distinct in that it exhibits several attractive qualities not inherited by the vineyards approach that warrants further study.
For completeness, we recapitulate the motivation of the moves algorithm from [15].Let f : K → [m] denote a filtration of size m = |K| and R = DV its decomposition.Consider the permutation P that moves a simplex σ i in K to position j, shifting all intermediate simplices σ i+1 , . . ., σ j down by one (i < j).To perform this shift, all entries V ik ̸ = 0 with column positions k ∈ [i + 1, j] need to set to zero, otherwise P V is not upper-triangular.We may zero these entries in V using column operations V ( * ), ensuring invariant I2 (2.1) is maintained, however these operations may render R ′ = P R( * )P T unreduced, breaking invariant I3.Of course, we could then reduce R ′ with additional column operations, but the number of such operations scales O(|i − j| 2 ), but this is no more efficient than simply performing the permutation and applying the reduction algorithm to columns [i, j] in R.
To bypass this difficulty, Busaryev et al. observed that if R contains s pivot entries in the columns [i, j], then since it is reduced, R ′ must also have s pivots.Thus, if column operations render some pivot-column r k of R unreduced, then its pivot entry low R (k) becomes free8 -if r k is copied prior to modification, we may re-use or donate its pivot entry to a later column r k+1 , . . ., r j .Repeating this process at most j − i − 1 times ensures R ′ stays reduced in all except possibly at its i-th column.Moreover, since the k-th such operation simultaneously sets v ik = 0, V ′ retains its upper-triangularity.
Example 2.3: We re-use the running example from sections 2.1 and 2.2 to illustrate moves.The donor columns of R and V are denoted as d R and d V , respectively.Consider moving edge a to the position of edge c in the filtration.
Note that the equivalent permutation using vineyards requires 4 column operations on both R 1 and V 1 , respectively, whereas a single move operation accomplishes using only 2 column operations per matrix.The pseudocode for MoveRight is given in Algorithm 1 and for MoveLeft in Algorithm 2.
Algorithm 1 Move Right Algorithm ( for k in I 2 , . . ., I s do 4: ( R, V ) ← ( P RP T , P V P T ) 7: Regarding the complexity of move operations, which clearly depend on the sparsity of R and V , we recall the proposition shown in [15]: while low R (l) ̸ = 0 and low R (r) ̸ = 0 do 4: return ( R, V ) 2: 5: (R, V ) ← (P RP T , P V P T ) 7: Proposition 1 (Busaryev et al. [15]).Given a filtration with n simplices of dimensions p − 1, p, and p + 1, let R = DV denote its associated decomposition.Then, the operation MoveRight(i, j) constructs a valid decomposition where I, J are given by: Moreover, the quantity Though similar to vineyards, move operations confer additional advantages: M1: Querying the non-zero status of entries in R or V occurs once per move.
M2: R = DV is not guaranteed to be valid during the movement of σ i to σ j .
M3: At most O(m) moves are needed to reindex f → f ′ First, consider property M1.Prior to applying any permutation P to the decomposition, it is necessary to remove non-zero entries in V which render P T V P non-upper triangular, to maintain invariant I2.Using vineyards, one must consistently perform |i − j| − 1 non-zero status queries interleaved between repairing column operations.A move operation groups these status queries into a single pass prior to performing any modifying operations.
Property M2 implies that the decomposition is not fully maintained during the execution of RestoreRight and Re-storeLeft below, which starkly contrasts the vineyards algorithm.In this way, we interpret move operations as making a tradeoff in granularity: whereas a sequence of adjacent transpositions (i, i+1), (i+1, i+2), . . ., (j−1, j) generates |i − j| valid decompositions in vineyards, an equivalent move operation Move(i, j) generates only one.Indeed, Property M3 directly follows from this fact, as one may simply move each simplex σ ∈ K into its new order f ′ (σ) via insertion sort.Note that the number of valid decompositions produced by vineyards is bounded above by O(m 2 ), as each pair of simplices σ i , σ j ∈ K may switch its relative ordering at most once during the interpolation from f to f ′ .
As shown by example 2.3, moves can be cheaper than vineyards in terms of column operations.However, it is not clear that this is always the case upon inspection of Algorithm 1, as the usage of a donor column seemingly implies that many O(m) copy operations need to be performed.It turns out that we may handle all such operations except the first in O(1) time, which we formalize below.
where s i denotes the transposition (i, i + 1), i < j, and k = |i − j|.Moreover, let M denote the number of column operations to perform the same update R → R ′ with Move(i, j).Then the inequality M ≤ T holds.
Proof.First, consider executing the vineyards algorithm with a given pair (i, j).As there are at most 2 column operations, any contiguous sequence of transpositions (i, i + 1), (i + 1, i + 2), . . .As a final remark, we note that the combination of MoveRight and MoveLeft enable efficient simplex additions or deletions to the underlying complex.In particular, given K and a decomposition R = DV , obtaining a valid decomposition R ′ = D ′ V ′ of K ′ = K ∪ {σ} can be achieved by appending its requisite elementary chains to D and V , reducing them, and then executing MoveLeft(m + 1, i) with i = f ′ (σ).Dually, deleting a simplex σ i may be achieved via MoveRight by moving i-th to the end of the decomposition and dropping the corresponding columns.

Our contribution: Move Schedules
We begin with a brief overview of the pipeline to compute the persistence diagrams of a discrete 1-parameter family (f 1 , f 2 , . . ., f n ) of filtrations.Without loss of generality, assume each filtration f i : K → [m] is a simplexwise filtration of a fixed simplicial complex K with |K| = m, and let (f, f ′ ) denote any pair of such filtrations.Our strategy to efficiently obtain a valid RV decomposition of the filtration (K, f ′ ) from a given decomposition of (K, f ) is to decompose a fixed bijection ρ : Given a pair of filtrations (K, f ), (K, f ′ ) and R = DV the initial decomposition of (K, f ), a schedule S = (s 1 , s 2 , . . ., s d ) is a sequence of permutations satisfying: where, for each i ∈ [d], R i = D i V i is a valid decomposition respecting invariants 2.1, and R ′ is a valid decomposition for (K, f ′ ).
To produce this sequence of permutations S = (s 1 , . . ., s d ) from ρ, our approach is as follows: define q = (ρ(1), ρ(2), . . ., ρ(m)).We compute a longest increasing subsequence LIS(q), and then use this subsequence to recover a longest common subsequence (LCS) between f and f ′ , which we denote later with LCS(f, f ′ ).We pass q, LIS(q), and a "greedy" heuristic for picking moves h to our scheduling algorithm (to be defined later), which returns as output an ordered set of move permutations S of minimum size satisfying (5)-a schedule for the pair (f, f ′ ).These sequence of steps is outlined in Algorithm 3 below.

Algorithm 3 Scheduling algorithm
Require: h ← greedy ▷ Heuristic for picking moves 6: S ← Schedule(q, lis q , h) ▷ O(d 2 log m), see Section 3. for (i, j) in S do Though using the LCS between a pair of permutations induced by simplexwise filtrations is a relatively intuitive way of producing a schedule of move-type decomposition updates, it is not immediately clear whether such an algorithm has any computational benefits compared to vineyards.Indeed, since moves is a generalization of vineyards, several important questions arise when considering practical aspects of how to implement Algorithm 3, such as e.g.how to pick a "good" heuristic h, how expensive can the Schedule algorithm be, or how |S| scales with respect to the size of the complex K.We address these issues in the following sections.

Continuous setting
In the vineyards setting, a given homotopy is discretized into a set of critical events that alter the filtration order.As F determines the number of distinct filtrations encountered during the deformation from f to f ′ , a natural question is whether such an interpolation can be modified so as to minimize |S|-the number of times the decomposition is restored to a valid state.Towards explaining the phenomenon exhibited in Figure 2, we begin by analyzing a class of interpolation schemes to establish an upper bound on this quantity.
) for every σ ∈ K.Note that this family includes the straight-line homotopy , studied in the original vineyards paper [22].If we assume that each pair of curves t, F (•, t) ⊂ [0, 1] × R intersect in at most one point-at which they cross-the continuity and genericity assumptions on F imply that for σ, µ ∈ K distinct, the curves t → F (σ, t) and t → F (µ, t) In other words, the number of crossings in F is exactly the Kendall-τ distance [25] between f and f ′ : After slightly perturbing F if necessary, we can further assume that its crossings occur at The continuity of the curves t → F (•, t) and the fact that t i is the sole crossing time in the interval (t i−1 , t i+1 ), imply that the permutation ρ i transforming f i−1 into f i , i.e. so that f i = ρ i • f i−1 , must be (in cycle notation) of the form ρ i = (ℓ i ℓ i + 1) for 1 ≤ ℓ i < m.In other words, ρ i is an adjacent transposition for each i = 1, . . ., k. Observe the size of the ordered sequence of adjacent transpositions S F = (ρ 1 , ρ 2 , . . ., ρ k ) defined from the homotopy F above is exactly K τ (f, f ′ ).On the positive side, the reduction of schedule planning to crossing detection implies the former can be solved optimally in output-sensitive O(m log m + k) time by several algorithms [13], where k is the outputsensitive term and m is the number of simplices in the filtration(s).On the negative side, k = K τ (f, f ′ ) scales in size to ∼ O(m 2 ) in the worst case, achieved when f ′ = −f .As mentioned in 2.2, this quadratic scaling induces a number of issues in the practical implementations of the vineyards algorithm Remark 1.The grayscale image data example from section 1.2 exhibits this quadratic scaling.Indeed, the Freudenthal triangulation of the 9 × 9 grid contains (81, 208, 128) simplices of dimensions (0, 1, 2), respectively.Therefore, m = 417 and |S F | ≤ 1 2 m(m − 1) = 86, 736.As the homotopy given by the video is varied, ≈ 70,000 transpositions are generated, approaching the worst case upper bound due to the fact that f ′ is nearly the reverse of f .If our goal is to decrease |S F |, one option is to coarsen S F to a new schedule S F by e.g.collapsing contiguous sequences of adjacent transpositions to moves, via the map and the associated coarsened S F requires just O(m) time to compute.However, the coarsening depends entirely on the initial choice of F and the quadratic upper bound remains-it is always possible that there are no contiguous subsequences to collapse.This suggests one must either abandon the continuous setting or make stronger assumptions on F to have any hope of keeping |S F | ∼ O(m) in size.

Discrete setting
Contrasting the continuous-time setting, if we discard the use of a homotopy interpolation and allow move operations in any order, we obtain a trivial upper bound of O(m) on the schedule size: simply move each simplex in K from its position in the filtration given by f to the position given by f ′ -which we call the naive strategy.In particular, in losing the interpolation interpretation, it is no longer clear the O(m) bound is tight.Indeed, the "intermediate" filtrations need no longer even respect the face poset of the underlying complex K.In this section, we investigate these issues from a combinatorial perspective.
Let S m denote the symmetric group.Given two fixed permutations p, q ∈ S m and a set of allowable permutations Σ ⊆ S m , a common problem is to find a sequence of permutations s 1 , s 2 , . . ., s d ∈ Σ whose composition satisfies: Common variations of this problem include finding such a sequence of minimal length (d) and bounding the length d as a function of m.In the latter case, the largest lower bound on d is referred to as the distance between p and q with respect to Σ.A sequence S = (s 1 , s 2 , . . ., s d ) of operations s ∈ Σ ⊆ S m mapping p → q is sometimes called a sorting of p.When p, q are interpreted as strings, these operations s ∈ Σ are called edit operations.The minimal number of edit operations d Σ (p, q) needed to sort p → q with respect to Σ is referred to as the edit distance [7] between p and q.We denote the space of sequences transforming p → q using d permutations in Σ ⊆ S m with Φ Σ (p, q, d).
Note the choice of Σ defines the type of distance being measured-otherwise if Σ = S m , then d Σ (p, q) = 1 trivially for any p ̸ = q ∈ S m .
Perhaps surprisingly, small changes to the set of allowable edit operations Σ dramatically affect both the size of d Σ (p, q) and the difficulty of obtaining a minimal sorting.For example, while sorting by transpositions and reversals is NP-hard and sorting by prefix transpositions is unknown, there are polynomial time algorithms for sorting by block interchanges, exchanges, and prefix exchanges [32].Sorting by adjacent transpositions can be achieved in many ways: any sorting algorithm that exchanges two adjacent elements during its execution (e.g.bubble sort, insertion sort) yields a sorting of size K τ (p, q).
Here we consider sorting by moves.Using permutations, a move operation m ij that moves i to j in [m], for i < j, corresponds to the circular rotation: In cycle notation, this corresponds to the cyclic permutation m ij = ( i j j-1 . . .i+2 i+1 ).Observe that a move operation can be interpreted as a paired delete-and-insert operation, i.e. m ij = (ins j • del i ), where del i denotes the operation that deletes the character at position i and ins j the operation that inserts the same character at position j.Thus, sorting by move operations can be interpreted as finding a minimal sequence of edits where the only operations allowed are (paired) insertions and deletions-this is exactly the well known Longest Common Subsequence (LCS) distance.Between strings p, q of sizes m and n, the LCS distance is given by [7]: With this insight in mind, we obtain the following bound on the minimum size of a sorting (i.e.schedule) using moves and the complexity of computing it.
Proposition 3 (Schedule Size).Let (K, f ), (K, f ′ ) denote two filtrations of size |K| = m.Then, the smallest move schedule S * re-indexing f → f ′ has size: where we use LCS(f, f ′ ) to denote the LCS of the permutations of K induced by f and f ′ .
Proof.Recall our definition of edit distance given above, depending on the choice Σ ⊆ S m of allowable edit operations, and that in order for any edit distance to be symmetric, if s ∈ Σ then s −1 ∈ Σ.This implies that d Σ (p, q) = d Σ (p −1 , q) for any choice of p, q ∈ S m .Moreover, edit distances are left-invariant, i.e.
Conceptually, left-invariance implies that the edit distance between any pair of permutations p, q is invariant under an arbitrary relabeling of p, q-as long as the relabeling is consistent.Thus, the following identity always holds: where ι = [m], the identity permutation.Suppose we are given two permutations p, q ∈ S n and we seek to compute LCS(p, q).Consider the permutation p ′ = q −1 • p.Since the LCS distance is a valid edit distance, if |LCS(p, q)| = k, then |LCS(p ′ , ι)| = k as well.Notice that ι is strictly increasing, and that any common subsequence p ′ has with ι must also be strictly increasing.The optimality of d follows from the optimality of the well-studied LCS problem [31].
For any pair of general string inputs of sizes n and m, respectively, the LCS between them is computable in O(mn) with dynamic programming, and there is substantial evidence that the complexity cannot be much lower than this [1].
In our setting, however, we interpret the simplexwise filtrations (K, f ), (K, f ′ ) as permutations of the same underlying complex K-in this setting, d LCS reduces further to the permutation edit distance problem.This special type of edit distance has additional structure to it, which we demonstrate below.Proof.By the same reduction from Proposition 3, the problem of computing LCS(f, f ′ ) reduces to the problem of computing the longest increasing subsequence (LIS) of a particular permutation p ′ ∈ S m , which can be done in O(m log log m) time using van Emde Boas trees [8] Reduced complexity is not the only immediate benefit from Proposition 3; by the same reduction to the LIS problem, we obtain the worst-case bounds on S in expectation.Corollary 2. If (K, f ), (K, f ′ ) are random filtrations of a common complex K of size m, then the expected size of a longest common subsequence Proof.The proof of this result reduces to showing the average length of the LIS for random permutations.Let L(p) ∈ [1, m] denote the maximal length of a increasing subsequence of p ∈ S m .The essential quantity to show the expected length of L(p) over all permutations: A large body of work dates back at least 50 years has focused on estimating this quantity, which is sometimes called the Ulam-Hammersley problem.Seminal work by Baik et al. [4] established that as m → ∞: where c = −1.77108....Moreover, letting m → ∞, we have: Thus, if p ∈ S m denotes a uniformly random permutation in S m , then L(p)/ √ m → 2 in probability as m → ∞.Using the reduction from above to show that LCS(p, q) ⇔ LIS(p ′ ), the claimed bound follows.
Remark 2. Note the quantity from Corollary 2 captures the size of S * between pairs of uniformly sampled permutations, as opposed to uniformly sampled filtrations, which have more structure due to the face poset.However, Boissonnat [11] prove the number of distinct filtrations built from a k-dimensional simplicial complex K with m simplices and t distinct filtration values is at least ⌊ t+1 k+1 ⌋ m .Since this bound grows similarly to m! when t ∼ O(m) and k << m fixed, d ≈ n − √ n is not too pessimistic a bound between random filtrations.
In practice, when one has a time-varying filtration and the sampling points are relatively close [in time], the LCS between adjacent filtrations is expected to be much larger, shrinking d substantially.For example, for the complex from Section 1.2 with m = 417 simplices, the average size of the LCS across the 10 evenly spaced filtrations was 343, implying d ≈ 70 permutations needed on average to update the decomposition between adjacent time points.
We conclude this section with the main theorem of this effort: an output-sensitive bound on the computation of persistence dynamically.Theorem 1.Given a pair of filtrations (K, f ), (K, f ′ ), a decomposition R = DV of K, and a sequence S = (s 1 , s 2 , . . ., s d ) of cyclic 'move' permutations s k = (i k , j k ) satisfying i k < j k for all k ∈ [d], computing the updates: where

Constructing schedules
While it is clear from the proof in Proposition 3 that one may compute the LCS between two permutations p, q ∈ S m in O(m log log m) time, it is not immediately clear how to obtain a sorting p → q from a given L = LCS(p, q) efficiently.We outline below a simple procedure which constructs such a sorting S = (s 1 , . . .Recall that a sorting S with respect to two permutations p, q ∈ S m is an ordered sequence of permutations S = (s 1 , s 2 , . . ., s d ) satisfying q = s d • . . .s 1 • p.By definition, a subsequence in L common to both p and q satisfies: where p −1 (σ) (resp.q −1 (σ)) denotes the position of σ in p (resp.q).Thus, obtaining a sorting p → q of size d = m − |L| reduces to applying a sequence of moves in the complement of L. Formally, we define a permutation s ∈ S m as a valid operation with respect to a fixed pair p, q ∈ S m if it satisfies: The problem of constructing a sorting S of size d thus reduces to choosing a sequence of d valid moves, which we call a valid sorting.To do this efficiently, let U denote an ordered set-like data structure that supports the following operations on elements σ ∈ M from the set M = {0, 1, . . ., m + 1}: Given U, a valid sorting can be constructed by querying and maintaining information about the LCS in U. To see this, suppose U contains the current LCS between two permutations p and q.By definition of the LCS, we have: for every σ ∈ U. Now, suppose we choose some element σ / ∈ U which we would like to add to the LCS.If p −1 (σ) < p −1 (U pred (σ)), then we must move σ to the right of its predecessor in p such that (13) holds.Similarly, if p −1 (U succ (σ)) < p −1 (σ), then we must move σ left of its successor in p.Assuming the structure U supports all of the above operations in O(log m) time, we easily deduce a O(dm log m) algorithm for obtaining a valid sorting.

Minimizing schedule cost
The algorithm outlined in section 3.3 is a sufficient for generating move schedules of minimal cardinality: any schedule of moves S sorting f → f ′ above is guaranteed to have size |S| = m − |LCS(f, f ′ )|, and the reduction to the permutation edit distance problem ensures this size is optimal.However, as with the vineyards algorithm, certain pairs of simplices cost more to exchange depending on whether they are critical pairs in the sense described in [22], resulting in a large variability in the cost of randomly generated schedules.This variability is undesirable in practice: we would like to generate a schedule which not only small in size, but is also efficient in terms of its required column operations.

Greedy approach
Ideally, we would like to minimize the cost of a schedule S ∈ Φ Σ (p, q, d) directly, which recall is given by the number of non-zeros at certain entries in R and V : where |I| + |J| are the quantities from Proposition 1. Globally minimizing the objective (14) directly is difficult due to the changing sparsity of the intermediate matrices R k , V k .One advantage of the moves framework is that the cost of a single move on a given R = DV decomposition can be determined efficiently prior to any column operations.Thus, it is natural to consider whether one could minimize ( 14) by greedily choosing the lowest cost move in each step.Unfortunately, not only does this approach does not yield a minimal cost solution, we give a counter-example in the appendix (A.2) demonstrating such a greedy procedure may lead to arbitrarily bad behavior.

Proxy objective
In light of section 3.4.1,we seek an alternative objective that correlates with ( 14) and does not depend on the entries in the decomposition.Given a pair of filtrations (K, f ), (K, f ′ ), a natural schedule S ∈ Φ(f, f ′ , d) of cyclic permutations (i 1 , j 1 ), (i 2 , j 2 ), . . ., (i d , j d ) is one minimizing the upper bound: Unfortunately, even obtaining an optimal schedule S * ∈ Φ(f, f ′ , d) minimizing (15) does not appear tractable due to its similarity with the k-layer crossing minimization problem, which is NP-hard for k sets of permutations when k ≥ 4 [9].For additional discussion on the relationship between these two problems, see section A.3 in the appendix.
In light of the discussion above, we devise a proxy objective function based on the Spearman distance [25] which we observed is both efficient to optimize and effective in practice.The Spearman footrule distance F (p, q) between two p, q ∈ S m is an ℓ 1 -type distance for measuring permutation disarrangement: Like K τ , F forms a metric on S m and is invariant under relabeling.Our motivation for considering the footrule distance is motivated by the fact that F recovers cost(S) when (p, q) differ by a cyclic permutation (i.e.F (p, m ij •p) = 2|i − j|) and by its usage on similar10 combinatorial optimization problems.To adapt F to sortings, we decompose F additively via the bound: where ŝi = s i • • • • • s 2 • s 1 denotes the composition of the first i permutations of a sorting S = (s 1 , . . ., s d ) that maps p → ι and s 0 = ι.As a heuristic to minimize (17), we greedily select the optimal k ∈ L at each step which minimizes the Spearman distance to the identity permutation: Note that equality between F and FS (17) is achieved when the displacement of each σ ∈ p between its initial position in p to its value is non-increasing with every application of s i , which in general not guaranteed using the schedule construction method in section 3. To build intuition for how this heuristic interacts with Algorithm 5, we show a purely combinatorial example below.
Example: The permutation p ∈ S m to sort to the identity p → ι is given as a sequence (4, 2, 7, 1, 8, 6, 3, 5, 0) and a precomputed LIS L = (1, 3, 5), which is highlighted in red.On the left, a table records permuted sequences given on rows i ∈ {0, 1, . . ., 6}, for the moved elements (7, 0, 8, 4, 6, 2) (highlighted in yellow).In the middle, we show the Spearman distance cost (18) associated with both each permuted sequence (left column) and with each candidate permutation m ij • p (rows).Note that elements σ ∈ L (red) induce identity permutations that do not modify the cost, and thus the minimization is restricted to elements σ ∈ p \ L (black).The right plot shows the connection to the bipartite crossing minimization problem discussed in section 3.4.1.

Heuristic Computation
We now introduce an efficient O(d 2 log m) algorithm for constructing a move schedule that greedily minimizes (17).The algorithm is purely combinatorial, and is motivated by the simplicity of updating the Spearman distance between sequences which differ by a cyclic permutation.
First, consider an array A of size m which provides O(1) access and modification, initialized with the signed displacement of every element in p to its corresponding position in q.In the case where q = ι, note A(i) = i − p(i), thus F (p, ι) is given by the sum of the absolute values of the elements in A. In other words, computing F (m ij • p, ι) in O(log m) time corresponds to updating an array's prefix sum under element insertions and deletions, which is easily solved by many data structures (e.g.segment trees).
Because the values of A are signed, the reduction to prefix sums is not exact: it is not immediately clear how to modify |i − j| elements in O(log m) time.To address this, observe that at any point during the execution of Algorithm 5, a cyclic permutation changes each value of A in at most three different ways: Thus, A may be partitioned into at most four contiguous intervals, each upon which the update is constant.Such updates are called range updates, and are known to require O(log m) time using e.g. an implicit treap data structure [10].
Since single element modifications, deletions, insertions, and constant range updates can all be achieved in O(log m) expected time with such a data structure, we conclude that equation ( 18) may be solved in just O(d 2 log m) time.We give an example of this partitioning in Figure 3.

Video data
A common application of persistence is characterizing topological structure in image data.Since a set of "snapshot" frames of a video can be equivalently thought of as discrete 1-parameter family, our framework provides a natural extension of the typical image analysis to video data.To demonstrate the benefit of scheduling and the scalability of the greedy heuristic, we perform two performance tests on the video data from section 1.2: one to test the impact of repairing the decomposition less and one to measure the asymptotic behavior of the greedy approach.
In the first test, we fix a grid size of 9 × 9 and record the cumulative number of column operations needed to compute persistence dynamically across 25 evenly-spaced time points using a variety of scheduling strategies.The three strategies we test are the greedy approach from section 3.4.2, the "simple" approach which uses upwards of O(m) move permutations via selection sort, and a third strategy which interpolates between the two.To perform this interpolation, we use a parameter α ∈ [0, 1] to choose m − α • d random simplices to move using the same construction method outlined in section 3.3.The results are summarized in the left graph on Figure 4, wherein the mean schedule cost of the random strategies are depicted by solid lines.To capture the variation in performance, we run 10 independent iterations and shade the upper and lower bounds of the schedule costs.As seen in Figure 4, while using less move operations (lower α) tends to reduce column operations, constructing random schedules of minimal size is no more competitive than the selection sort strategy.This suggests that efficient schedule construction needs to account for the structure of performing several permutations in sequence, like the greedy heuristic we introduced, to yield an adequate performance boost.
In the second test, we measure the asymptotics of our greedy LCS-based approach.To do this, we generated 8 video data sets again of the expanding annulus outlined in section 1.2, each of increasing grid sizes of 5 × 5, 6 × 6, . . ., 12 × 12.For each data set, we compute persistence over the duration of the video, again testing five evenly spaced settings of α ∈ [0, 1]-the results are shown in the right plot of Figure 4. On the vertical axis, we plot the total number of column operations needed to compute persistence across 25 evenly-spaced time points as a ratio of the data set size (m); we also show the regression curves one obtains for each setting of α.As one can see from the Figure, the cost of using the greedy heuristic tends to increase sub-linearly as a function of the data set size, suggesting the move scheduling approach is indeed quite scalable.Moreover, schedules with minimal size tended to be cheaper than otherwise, confirming our initial hypothesis that repairing the decomposition less can lead to substantial reductions at run-time.

Crocker stacks
There are many challenges to characterizing topological behavior in dynamic settings.One approach is to trace out the curves constituting a continuous family of persistence diagrams in R 3 -the vineyards approach-however this visualization can be cumbersome to work with as there are potentially many such vines tangled together, making topological critical events with low persistence difficult to detect.Moreover, the vineyards visualization does not admit a natural simplification utilizing the stability properties of persistence, as individual vines are not stable: if two vines move near each other and then pull apart without touching, then a pairing in their corresponding persistence diagrams may cross under a small perturbation, signaling the presence of an erroneous topological critical event [45,47].Acknowledging this, Topaz et al. [45] proposed the use of a 2-dimensional summary visualization, called a crocker11 plot.In brief, a crocker plot is a contour plot of a family of Betti curves.Formally, given a filtration p is defined as the ordered sequence of p-th dimensional Betti numbers: Given a time-varying filtration K(τ ), a crocker plot displays changes to β • p (τ ) as a function of τ .An example of a crocker plot generated from the simulation described below is given in Figure 5. Since only the Betti numbers at each simplex in the filtration are needed to generate these Betti curves, the persistence diagram is not directly needed to generate a crocker plot; it is sufficient to use e.g.any of the specialized methods discussed in 1.1.This dependence only on the Betti numbers makes crocker plots easier to compute than standard persistence, however what one gains in efficiency one loses in stability; it is known that Betti curves are inherently unstable with respect to small fluctuations about the diagonal of the persistence diagram.
Xian et al. [47] showed that crocker plots may be smoothed to inherit the stability property of persistence diagrams and reduce noise in the visualization.That is, when applied to a time-varying persistence module M = {M t } t∈[0,T ] , an α-smoothed crocker plot for α ≥ 0 is the rank of the map M t (ϵ − α) → M t (ϵ + α) at time t and scale ϵ.For example, the standard crock plot is a 0-smoothed crocker plot.Allowing all three parameters (t, ϵ, α) to vary continuously leads to 3D visualization called an α-smoothed crocker stack.
Definition 2 (crocker stack).A crocker stack is a family of α-smoothed crock plots which summarizes the topological information of a time-varying persistence module M via the f Note that, unlike crocker plots, applying this α smoothing efficiently requires the persistence pairing.Indeed, it has been shown that crocker stacks and stacked persistence diagrams (i.e.vineyards) are equivalent to each other in the sense that either one contains the information needed to reconstruct the other [47].Thus, computing crocker stacks reduces to computing the persistence of a (time-varying) family of filtrations.
To illustrate the applicability of our method, we test the efficiency of computing these crocker stacks using a spatiotemporal data set.Specifically, we ran a flocking simulation similar to the one run in [45] with m = 20 vertices moving around on the unit square equipped with periodic boundary conditions (i.e. S 1 × S 1 ).We simulated movement by equipping the vertices with a simple set of rules which control how the individual vertices position change over time.Such simulations are also called boid simulations, and they have been extensively used as models to describe how the evolution of collective behavior over time can be described by simple sets of rules.The simulation is initialized with every vertex positioned randomly in the space; the positions of vertices over time is updated according to a set of rules related to the vertices' acceleration, distance to other vertices, etc.To get a sense of the time domain, we ran the simulation until a vertex made at least 5 rotations around the torus.
Given this time-evolving data set, we computed the persistence diagram of the Rips filtration up to ϵ = 0.30 at 60 evenly spaced time points using three approaches: the standard algorithm pHcol applied naively at each of the 60 time steps, the vineyards algorithm applied to (linear) homotopy connecting filtrations adjacent in time, and our approach using moves.The cumulative number of O(m) column operations executed by three different approaches.Note again that vineyards requires generating many decompositions by design (in this case, ≈ 1.8M ).The standard algorithm pHcol and our move strategy were computed at 60 evenly spaced time points.As depicted in Figure 6, our moves strategy is far more efficient than both vineyards and the naive pHcol strategies.
Figure 6: On the left, the cumulative number of column operations (log-scale) of the three baseline approaches tested.On the right, the normalized K τ between adjacent filtrations depicts the coarseness of the discretization-about 5% of the ≈ O(m 2 ) simplex pairs between adjacent filtrations are discordant.

Multiparameter persistence
Given a procedure to filter a space in multiple dimensions simultaneously, a multifiltration, the goal of multi-parameter persistence is to identify persistent features by examining the entire multifiltration.Such a generalization has appeared naturally in many application contexts, showing potential as a tool for exploratory data analysis [37].Indeed, one of the drawbacks of persistence is its instability with respect to strong outliers, which can obscure the detection of significant topological structures [14].One exemplary use case of multi-parameter persistence is to detect these strong outliers by filtering the data with respect to both the original filter function and density.In this section, we show the utility of scheduling with a real-world use case: detecting the presence of a low-dimensional topological space which well-approximates the distribution of natural images.As a quick outline, in what follows we briefly recall the fibered barcode invariant 4.3.1,summarize its potential application to a particular data set with known topological structure 4.3.2, and conclude with experiments of demonstrating how scheduling enables such applications 4.3.3.

Fibered barcode
Unfortunately, unlike the one-parameter case, there is no complete discrete invariant for multi-parameter persistence.Circumventing this, Lesnick et al [35] associate a variety of incomplete invariants to 2-parameter persistence modules; we focus here on the fibered barcode invariant, defined as follows: Definition 3 (Fibered barcode).The fibered barcode B(M ) of a 2D persistence module M is the map which sends each line L ⊂ R 2 with non-negative slope to the barcode B L (M ): Equivalently, B(M ) is the 2-parameter family of barcodes given by restricting M to the of set affine lines with nonnegative slope in R 2 .
Although an intuitive invariant, it is not clear how one might go about computing B(M ) efficiently.One obvious choice is fix L via a linear combination of two filter functions, restrict M to L, and compute the associated 1-parameter barcode.However, this is an O(m 3 ) time computation, which is prohibitive for interactive data analysis purposes.
Utilizing the equivalence between the rank and fibered barcode invariants, Lesnick and Wright [35] developed an elegant way of computing B(M ) via a re-parameterization using standard point-line duality.This clever technique Figure 7: Bipersistence example on an 8 × 8 coarsened grid.On the left, the input data, colored by density.In the middle, the bigraded Betti numbers β 0 (M ) and β 1 (M ) (green and red, respectively), the dimension function (gray), and a line L emphasizing the persistence of features with high density.On the right, the line arrangement A(M ) lying in the dual space derived from the β(M ).
effectively reduces the fibered barcode computation to a sequence of 1-D barcode computations at "template points" lying within the 2-cells of a particular planar subdivision A(M ) of the half-plane [0, ∞) × R.This particular subdivision is induced by the arrangement of "critical lines" derived by the bigraded Betti numbers β(M ) of M .As the barcode of one template point T e at the 2-cell e ∈ A(M ) may be computed efficiently by re-using information from an adjacent template point T e ′ , [35] observed that computing the barcodes of all such template points (and thus, B(M )) may be reduced to ordering the 2-cells in A(M ) along an Eulerian path traversing the dual graph of A(M ).The full algorithm is out of scope for this effort; we include supplementary details for the curious reader in the appendix A.4.
Example 4.1: Consider a small set of noisy points distributed around S 1 containing a few strong outliers, as shown on the left side of Figure 7. Filtering this data set with respect to the Rips parameter and the complement of a kernel density estimate yields a bifiltration whose various invariants are shown in the middle figure.The gray areas indicate homology with positive dimension-the lighter gray area dim 1 (M ) = 1 indicates a persistent loop was detected.On the right side, dual space is shown: the black lines are the critical lines that form A(M ), the blue dashedlines the edges of the dual graph of A(M ), the rainbow lines overlaying the dashed-lines form the Eulerian path, and the orange barycentric points along the 2-cells of A(M ) represent where the barcodes templates T e are parameterized.Despite its elegance, there are significant computational barriers prohibiting the 2-parameter persistence algorithm from being practical.An analysis from [35] (using vineyards) shows the barcodes template computation requires on the order of O(m 3 κ + mκ 2 log κ) elementary operations and O(mκ 2 ) storage, where κ is a coarseness parameter.Since the number of 2-cells in A(M ) is on the order O(κ 2 ), and κ itself is on the order of O(m 2 ) in the worst case, the scaling of the barcode template computation may approach ≈ O(m 5 )-this is both the highest complexity and most time-intensive sub-procedure the RIVET software [44] depends on.Despite this significant complexity barrier, in practice the external stability result from [33] justifies the use of a grid-like reduction procedure which approximates the module M with a smaller module M ′ , enabling practitioners to restrict the size of κ to a relatively small constant.This in-turn dramatically reduces the size of A(M ) and thus the number of barcode templates to compute.Moreover, the ordering of barcode templates given by the dual graph traversal implies that adjacent template points should be relatively close-so long as κ is not too small-suggesting adjacent templates may productively share computations due to the high similarity of their associated filtrations.Indeed, as algorithm 3 was designed for precisely such a computation, 2-parameter persistence is prototypical of the class of methods that stand to benefit from moves.

Natural images dataset
A common hypothesis is that high dimensional data tend to lie in the vicinity of an embedded, low dimensional manifold or topological space.An exemplary demonstration of this is given in the analysis by Lee et al. [34], who explored the space of high-contrast patches extracted from Hans van Hateren's [28] still image collection 12 , which consists of ≈ 4, 000 monochrome images depicting various areas outside Groningen (Holland).In particular, [34] were interested in exploring how high-contrast 3 × 3 image patches were distributed, in pixel-space, with respect to predicted spaces and manifolds.Formally, they measured contrast using a discrete version of the scale-invariant Dirichlet semi-norm: where D is a fixed matrix whose quadratic form x T Dx applied to an image x ∈ R 9 is proportional to the sum of the differences between each pixels 4 connected neighbors (given above by the relation i ∼ j).Their research was primarily motivated by discerning whether there existed clear qualitative differences in the distributions of patches extracted from images of different modalities, such as optical and range images.By mean-centering, contrast normalizing, and"whitening" the data via the Discrete Cosine Transform (DCT), they show a convenient basis for D may be obtained via an expansion of 8 certain non-constant eigenvectors, shown below: Since these images are scale-invariant, the expansion of these basis vectors spans the 7-sphere, S 7 ⊂ R 8 .Using a Voronoi cell decomposition of the data, their distribution analysis suggested that the majority of data points concentrated in a few high-density regions.
In follow-up work, Carlsson et al. [16] found-using persistent homology-that the distribution of high-contrast 3 × 3 patches is actually well-approximated by a Klein bottle M-around 60% of the high-contrast patches from the still image data set lie within a small neighborhood around M accounting for only 21% of the 7-sphere's volume.Along a similar vein, Perea [42] established a dictionary learning framework for efficiently estimating the distribution of patches from texture images, prompting applications for persistent homology in sparse coding contexts.
If one was not aware of the analysis done by [34,28,16,42], it is not immediately clear a priori that the Klein bottle model is a good candidate for capturing the non-linearity of image patches.Indeed, armed with a refined topological intuition, Carlsson still needed to perform extensive sampling, preprocessing, and model fitting techniques in order to reveal the underlying the topological space with persistent homology [16].One reason such preprocessing is needed is due to persistent homology's aforementioned instability with respect to strong outliers.In the ideal setting, a multiparameter approach that accounts for the local density of points should require far less experimentation.
To demonstrate the benefit of 2-parameter persistence on the patch data, consider the (coarsened) fibered barcode computed from a standard Rips / codensity bifiltration on a representative sample of the image data from [28], shown in Figure 8. From the bigraded Betti number and the dimension function, one finds that a large area of the dimension function is constant (highlighted as the blue portion in the middle of Figure 8), wherein the first Betti number is 5.Further inspection suggests one plausible candidate is the three-circle model C 3 , which consists of three circles, two of which (say, S v and S h ) intersect the third (say, S lin ) in exactly two points, but themselves do not intersect.Projecting the image data onto the first two basis vectors from the DCT shown above leads to the projection shown in the top left of Figure 8, of which 15 landmark points are also shown.Observe the data are distributed well around three "circles"-the outside circle capturing the rotation gradient of the image patches (S lin ), and the other two capturing the vertical and horizontal gradients (S v and S h , respectively).Since the three circle model is the 1-skeleton of the Klein bottle, one may concur with Carlssons analysis [16] that the Klein bottle may be a reasonable candidate upon which the image data are distributed.
The degree to which multi-parameter persistence simplifies this exploratory phase cannot be understated: we believe multi-parameter persistence has a larger role to play in manifold learning.Unfortunately, as mentioned prior, the compute barriers effectively bar its use in practice.

Accelerating 2D persistence
Having outlined the computational theory of 2-parameter persistence, we now demonstrate the efficiency of moves using the same high-contrast patch data set studied in [34] by evaluating the performance of various methods at computing the fibered barcode invariant via the parameterization from A.4.
Due to the aforementioned high complexity of the fibered barcode computation, we begin by working with a subset of the image patch data X .We combine the use of furthest-point sampling and proportionate allocation (stratified) sampling to sample landmarks X ⊂ X distributed within n = 25 strata.Each stratum consists of the (1/n)-thick level set given the k-nearest neighbor density estimator ρ 15 with k = 15.The use of furthest-point sampling gives us certain coverage guarantees that the geometry is approximately preserved within each level set, whereas the stratification ensures the original density of is approximated preserved as well.From this data set, we construct a Rips-(co)density bifiltration using ρ 15 equipped with the geodesic metric computed over the same k-nearest neighbor graph on X.
Finally, we record the number of column reductions needed to compute the fibered barcode at a variety of levels of coarsening using pHcol, vineyards, and our moves approach.The results are summarized in Table 1.We also record the number of 2-cells in A(M ) and the number of permutations applied encountered along the traversal of the dual graph for both vineyards and moves, denoted in the table as d K and d LCS , respectively.As shown on the table, when the coarsening κ is small enough, we're able to achieve a significant reduction in the number of total column operations needed to compute T compared to both vineyards and pHcol.This is further reinforced by the observation that vineyards is particularly inefficient when then 1-parameter family is coarse.Indeed, moves requires about 3x less column operations than naively computing T independently.However, note that as the coarsening becomes more refined and more 2-cells are added to A(M ), vineyards becomes a more viable option compared to pHcol-as the asymptotics suggests-though even at the highest coarsening we tested the gain in efficiency is relatively small.In contrast, moves scales quite well with this refinement, requiring about 12% and ≈ 5% of the number of column operations as vineyards and pHcol, respectively.

Conclusion and Future Work
In conclusion, we presented a scheduling algorithm for efficiently updating a decomposition in coarse dynamic settings.Our approach is simple, relatively easy to implement, and fully general: it does not depend on the geometry of underlying space, the choice of triangulation, or the choice of homology dimension.Moreover, we supplied efficient algorithms for our scheduling strategy, provided tight bounds where applicable, and demonstrated our algorithms performance with several real world use cases.
There are many possible applications of our work beyond the ones discussed in section 4, such as e.g.accelerating PH featurization methods or detecting homological critical points in dynamic settings.Indeed, we see our approach as potentially useful to any situation where the structure of interest can be cast as a parameterized family of persistence diagrams.Areas of particular interest include time-series analysis and dynamic metric spaces [30].
The simple and combinatorial nature of our approach does pose some limitations to its applicability.For example, better bounds or algorithms may be obtainable if stronger assumptions can be made on how the filtration is changing with time.Moreover, if the filtration (K, f ) shares little similarity to the "target" filtration (L, f ′ ), then the overhead of reducing the simplices from L \ K appended to the decomposition derived from K may be large enough to motivate simply computing the decomposition at L independently.Our approach is primarily useful if the filtrations in the parameterized family are "nearby" in the combinatorial sense.
From an implementation perspective, one non-trivial complication of our approach is its heavy dependence on a particular sparse matrix data structure, which permits permuting both the row and columns of a given matrix in at most O(m) time [22].As shown with the natural images example in section 4, there are often more permutation operations being applied than there are column reductions.In the more standard compressed sparse matrix representations, permuting both the rows and columns generally takes at most O(Z) time, where Z is the number of non-zero entries, which can be quite expensive if the particular filtration has many cycles.As a result, the more complex sparse matrix representation from [22] is necessary to be efficient in practice.
Moving forward, our results suggest there are many aspects of computing persistence in dynamic settings yet to be explored.For example, it's not immediately clear whether one could adopt, for example, the twist optimization [18] used in the reduction algorithm to the dynamic setting.Another direction to explore would be the analysis of our approach under the cohomology computation [23], or the specialization of the move operations to specific types of filtrations such as Rips filtrations.Such adaptations may result in even greater reductions in the number of column operations, as have been observed in practice for the standard reduction algorithm [5].Moreover, though we have carefully constructed an efficient greedy heuristic in section 3.4.2and illustrated a different perspective with which to view our heuristic (via crossing minimization), it is an open question whether there exists a more structured reduction of ( 14) or ( 17) to a better-known problem.

Declarations
Ethical Approval: Not applicable.
Competing interests: The authors declare that they have no competing interests that could influence the interpretation or presentation of the research findings.There are no financial or personal relationships with individuals or organizations that could bias the outcomes of this work.

Authors' contributions:
The contributions of each author to this article were as follows: (R, V ) ← (D, I) 3: for j = 1 to m do 4: subsequently modified in-place.After V is set to the identity, the algorithm proceeds with column operations on both is tight [39], though this seems to only be true on pathological inputs.A more refined analysis by Edelsbrunner et al. [27] shows the reduction algorithm scales by the sum of squares of the cycle persistences.
Move Left: Though conceptually similar, note that there is an asymmetry between MoveRight and MoveLeft: moving a simplex upwards in the filtration requires removing non-zero entries along several columns of a particular row in V so that the corresponding permutation does not render V non-upper triangular.The key insight of the algorithm presented in [15] is that R can actually be maintained in all but one column during this procedure (by employing the donor column).In contrast, moving a simplex to an earlier time in the filtration requires removing non-zero entries along several rows of a particular column of V .As before, though R stays reduced during this cancellation procedure in all but one column, the subsequent permutation to R requires reducing a pair of columns which may cascade into a larger chain of column operations to keep R reduced.This is due to the fact that higher entries in columns in R (above the pivot entry) may very well introduce additional non-reduced columns after R is permuted.Since these operations always occur in a left-to-right fashion, it is not immediately clear how to apply a donor column kind of concept.Fortunately, like move right, we can still separate the algorithm into a reduction and restoration phase-see Algorithm 2.Moreover, since R is reduced in all but one column by line 6 in Algorithm 2, we can still guarantee the number of low entries to reduce in R will be at most |i − j|.For a supplementary description of the move algorithm, see [15].

LCS-Sort:
The algorithm to construct a schedule from a given LIS is simple enough to derive using the rules discussed in section 3.3 (namely, equation ( 12), but nonetheless for posterity's sake we record it here for the curious reader; the pseudocode is given in Algorithm 5.
The high level idea of the algorithm is to first construct the LCS between two permutations p, q ∈ S m and then apply cyclic permutations successively to p, each step adding a new element extending the LCS.The algorithmic steps are as follows: first, one re-labels q → ι to the identity permutation ι = [m] and applies a consistent re-labeling p → p.This relabeling preserves the LCS distance and has the additional advantage that q = ι = [m] is a strictly increasing subsequence, and thus computing the LCS between p, q ∈ S m reduces to computing the LIS L of p.Given L computed from p, since L is strictly increasing, the only elements left to permute are in L \ p, which we denote with D. After choosing any σ ∈ D, one applies a cyclic permutation to p that moves σ to any position that increases the size of L, continuing this way until the identity sequence ι is recovered.By sorting p → ι by operations that (strictly) increase the size of L, we ensure that the size of the corresponding schedule is exactly m − |L|.
To build the schedule of permutations efficiently, we use a set-like data structure U that supports efficient querying the successor and predecessor of any given s ∈ p with respect to L (such as a vEB tree).The simplified pseudocode also uses the inverse permutation p to query the position of a given element σ ∈ p, although a more efficient representation return S can be used via an implicit treap of the displacements array; see Section 3.4.2.After σ is inserted into L, we update p, its inverse permutations p−1 , D and U prior to the next move.The final set of permutations to sort p → ι (or equivalently, p → q) are stored in an array S, which is then returned for further use.

A.2 Greedy Counter Example
In this section we give a simple counter-example showing that the strategy that greedily chooses valid move permutations {m ij } minimizing the quantity cost RV (m ij ) =  move in succession would begin by moving x or z first, since these are the cheapest moves available, which implies one of S 1 , S 2 , S 5 , S 6 would be picked on the first iteration.While the cheapest schedule S 1 is in this candidate set, an iterative greedy procedure would pick either S 2 or S 5 , depending on the tie-breaker-thus, a greedy approach picking the lowest-cost move may not yield an optimal schedule.Indeed, as the most expensive schedule S 6 is in The set of critical lines crit(M ) with respect to some fixed set S ⊂ R 2 fully characterizes a certain planar subdivision of the half plane [0, ∞) × R.This planar subdivision, denoted by A(M ), is thus entirely determined by S under point line duality.A corollary from [35] shows that if the duals of two lines L, L ′ ∈ L are contained in the same 2-cell in A(M ), then S L = S L ′ , i.e. the partitions induced by push L are equivalent.Indeed, the total order on S L is simply the pullback of the total order on L with respect to the push map.Since A(M ) partitions the entire half-plane, the dual to every line L ∈ L is contained within A(M )-the desired reparameterization.
To connect this construction back to persistence, one requires the definition of bigraded Betti numbers.For our purposes, the i th -graded Betti number of M is simply a function β i (M ) : R 2 → N whose values indicate the number of elements at each degree in a basis of the i th module in a free resolution for M -the interested reader is referred to [35,17] for a more precise algebraic definition.Let S = supp β 0 (M ) ∪ supp β 1 (M ), where the functions β 0 (M ), β 1 (M ) are 0 th and 1 st bigraded Betti numbers of M , respectively.The main mathematical result from [35]  Computing ( 1) takes approximately ≈ O(m 3 ) using a matrix algorithm similar to Algorithm 4 [36].Constructing and storing the line arrangement A(M ) with n lines and k vertices is related to the line segment intersection problem, which known algorithms in computational geometry can solve in (optimal) output-sensitive O((n+k) log n) time [12].
In terms of space complexity, the number of 2-cells in A(M ) is upper bounded by O(κ 2 ), where κ is a coarseness parameter associated with the computation of β(M ).
There are several approaches one can use to compute T , the simplest being to run Algorithm 4 independently on the 1-D filtration induced by the duals of some set of points (e.g. the barycenters) lying in the interior of the 2-cells of A(M ).The approach taken by [35] is to use the R = DV decomposition computed at some adjacent 2-cell e ∈ A(M ) to speed up the computation of an adjacent cell e ′ ∈ A(M ).More explicitly, define the dual graph of A(M ) to be the undirected graph G which has a vertex for every 2-cell e ∈ A(M ) and an edge for each adjacent pair of cells e, e ′ ∈ A(M ).Each vertex in G is associated with a barcode template T e , and the computation of T now reduces to computing a path Γ on G which visits each vertex at least once.To minimize the computation time, assume the n edges of G are endowed with non-negative weights W = w 1 , w 2 , . . ., w n whose values w i ∈ R + represent some notion of distance which is proportional to the computational disparity between adjacent template computations.The optimal path Γ * that minimizes the computation time is then the minimal length path with respect to W which visits every vertex of G at least once.There is a known 3 2 -approximation that can be computed efficiently which reduces the problem to the traveling salesman problem on a metric graph [20], and thus can be used so long as the distance function between templates is a valid metrics.[37] use the Kendall distance between the push-map induced filtrations, but other options are available-for example, any of the combinatorial metrics we studied in Section 3.4.

Figure 1 :
Figure 1: Top: A video of an expanding annulus.Bottom: Sublevel-set filtrations, via negative pixel intensity, of a Freudenthal triangulation of the plane.

Figure 2 :
Figure 2: The cumulative column operations needed to compute persistence across the time-varying filtration of grayscale images.Observe 10 independent persistence computations evenly spaced in time (green line) captures the major topological changes and is the most computationally efficient approach shown.

Proposition 2 .
Let (K, f ) denote a filtration of size |K| = m with decomposition R = DV and let T denote the number of column operations needed by vineyards to perform the sequence of transpositions: (j − 1, j) induces at most 2(|i − j|) column operations in both R and V , giving a total of 4(|i − j|) column operations.Now consider MoveRight(i,j), outlined in Algorithm 1.Here, the dominant cost again are the column operations (line 5).Though we need an extra O(m) storage allocation for the donor columns d * prior to the movement, notice that assignment to and from d * (lines (4), and (7) in RestoreLeft and MoveRight, respectively) requires just O(1) time via a pointer swapping argument.That is, when d ′ low < d low , instead of copying col * (k) to d ′ * -which takes O(m) time-we instead swap their column pointers in O(1) prior to column operations.After the movement, d * contains the newly modified column and col * (k) contains the unmodified donor d ′ * , so the final donor swap also requires O(1) time.Since at most one O(m) column operation is required for each index in [i, j], moving a column from i to j where i < j requires at most 2(|i − j|) column operations for both R and V .The claimed inequality follows.

Corollary 1 .
Let (K, f ), (K, f ′ ) denote two filtrations of size |K| = m, and let S * denote schedule of minimal size re-indexing f → f ′ .Then |S * | = d can be determined in O(m log log m) time.
denotes a valid decomposition of (K, f ′ ) requires O(ν) column operations, where ν depends on the sparsity of the intermediate entriesV 1 , V 2 , . . ., V d and R 1 , R 2 , . . ., R d : ν = d k=1 (|I k | + |J k |),where |I k |, |J k |are given in Proposition 1 Moreover, the size of a minimal such S can be determined in O(m log log m) time and O(m) space.Proof.Proposition 3 yields the necessary conditions for constructing S with optimal size d in O(m log log m) time and O(m).The definition of ν follows directly from Algorithm 1.
, s d ) in O(dm log m) time and O(m) space, or O(m log m) time and O(m) space per update in the online setting.

Figure 3 :
Figure 3: We re-use the previous example of sorting the sequence p = (4, 2, 7, 1, 8, 6, 3, 5, 0) to the identity ι = [m] using a precomputed LIS L = (1, 3, 5).In the left table shown below, we display the same sorting as before, with elements chosen to move highlighted in yellow; on the right, a table showing the entries of A are recorded.

Figure 4 :
Figure 4: Performance comparison between various scheduling strategies.On the left, the cumulative column operations required to compute the 1-parameter family is shown for varying schedule sizes (d) and strategies.On the right, both the size of the schedule and the data set (m) are varied.

Figure 5 :
Figure 5: A Crocker plot (right) depicts the evolution of dimension p = 1 Betti curves over time.The green X marks correspond chronologically to the complexes (left), in row-major order.The large orange and purple areas depict 1-cycles persisting in both space (y-axis) and time (x-axis).

Figure 8 :
Figure 8: Bipersistence example of natural images data set on a 12 × 16 coarsened grid.On the left, a projection of the full data set is shown, along with the 15 landmark patches.(Middle) the bigraded Betti numbers and a fixed line L over parameter space.As before, the 0/1/2 dimension bigraded Betti numbers are shown in green/red/yellow, respectively, with the blue region highlighting where dim(M ) = 5. (Right) five persistent features representing B L (M ) are revealed from the middle, matching β 1 of the three-circle model.
R and V , left to right, until the decomposition invariants are satisfied.Since each column operation takes O(m) and there are potentially O(k) columns in D with identical low entries (line 4 in 4, observe that the reduction algorithm below clearly takes O(m 2 k) time.Since there exist complexes where k ∼ O(m), one concludes the bound of O(m 3 )

j l=i+1 1 ( 1 (3 4 8 9 10 5 6 7
v l (i) ̸ = 0 ) + m l=1 low R (l) ∈ [i, j] and r l (i) ̸ = 0 )can lead to arbitrarily bad behavior.A pair of filtrations is given below, each comprising the 1-skeleton of a 3-simplex.Relabeling (K, f ) to the index set f : K → [m] and modifying (K, f ′ ) accordingly yields the permutations:(K, f ) = {a b c d u v w x y z} = 1 f ′ ) = {a b c d x y z u v w} = 1 2The values colored in red corresponds to LCS(f, f ′ ).For this example, the edit distance is d = m − |LCS(f, f ′ )| implies exactly 3 moves are needed to map f → f ′ .There are six possible valid schedules of moves:S 1 = m xu , m yu , m zu S 3 = m yu , m xy , m zu S 5 = m zu , m xz , m yz S 2 = m xu , m zu , m yz S 4 = m yu , m zu , m xy S 6 = m zu , m yz , m xzwhere the notation m xy represents the move permutation that moves x to the position of y.The cost of each move operation and each schedule is recorded in is a characterization of the barcodes B L (M ), for any L ∈ L, in terms of a set of barcode templates T computed at every 2-cell in A(M ).More formally, for any line L ∈ L and e any 2-cell in A(M ) whose closure contains the dual of L under point-line duality, the 1-parameter restriction of the persistence module M induced by L is given by:B L (M ) = {[ push L (a), push L (b) ) | (a, b) ∈ T e , push L (a) < push L (b)}(21)Minor additional conditions are needed for handling completely horizontal and vertical lines.The importance of this theorem lies in the fact that the fibered barcodes are completely defined from the precomputed barcode templates T -once every barcode template T e has been computed and augmented onto A(M ), B(M ) is completely characterized, and the barcodes B L (M ) associated to a 1-D filtration induced by any choice of L can be efficiently computed via a point-location query on A(M ) and a O(|B L (M )|) application of the push map.Invariant computation: Computationally, the algorithm from[36] can be summarized into three steps:1 Compute the bigraded Betti numbers β(M ) of M 2 Construct a line arrangement A(M ) induced by critical lines from (1) 3 Augment A(M ) with barcode templates T e at every 2-cell e ∈ A(M )

Table 1 :
Cost to computing T for various coarsening choices of β(M ).
The reduction algorithm, outlined in pseudocode in Algorithm 4, is the most widely used modality for computing persistence.While there exists other algorithms for computing persistence, they are typically not competitive with the reduction algorithm in practice.The algorithm begins by copying D to a new matrix R, to be Funding: The research presented in this work is partially supported by the National Science Foundation through grants CCF-2006661 and CAREER award DMS-1943758.The funding source had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Table 2 .
Note the greedy strategy which always selects the cheapest

Table 2 :
Move schedule costs