Distributing Persistent Homology via Spectral Sequences

We set up the theory for a distributed algorithm for computing persistent homology. For this purpose we develop linear algebra of persistence modules. We present bases of persistence modules, together with an operation $\boxplus$ that leads to a method for obtaining images, kernels and cokernels of tame persistence morphisms. Our focus is on developing efficient methods for the computation of homology of chains of persistence modules. Later we give a brief, self-contained presentation of the Mayer–Vietoris spectral sequence. Then we study the Persistent Mayer–Vietoris spectral sequence and present a solution to the extension problem. This solution is given by finding coefficients that indicate gluings between bars on the same dimension. Finally, we review PerMaViss, an algorithm that computes all pages in the spectral sequence and solves the extension problem. This procedure distributes computations on subcomplexes, while focusing on merging homological information. Additionally, some computational bounds are found which confirm the distribution of the method.


INTRODUCTION
1.1. Motivation. Persistent homology has existed for about two decades [18]. This tool of applied topology has played a central role in applications, such as the study of the geometric structure of sets of points lying in $\mathbb{R}^n$, see [15,18]. This introduced the field of Topological Data Analysis which, very soon, was applied to a multitude of problems; see [6,19] for a survey article and an introduction. Among others, persistent homology has been applied to study coverage in sensor networks [13], pattern detection [26], and classification and recovery of signals [27], and it has also had an impact on shape recognition using machine learning techniques, see [1,16]. All these applications motivate the need for fast algorithms for computing persistent homology. The usual algorithm used for these computations was introduced in [18], with some later additions to speed it up, such as those of [8,9,14]. In [24] persistent homology is proven to be computable in matrix multiplication time. However, since these matrices become large very quickly, the computations are generally very expensive, both in terms of computational time and in memory required.
In practice, computing the persistent homology of a given filtered complex amounts to computing its matrices of differentials and performing successive Gaussian eliminations; see [17,18]. In recent years, some methods have been developed for the parallelization of persistent homology. The first approach was introduced in [17] as the spectral sequence algorithm, and was successfully implemented in [3]. This consists in dividing the original matrix $M$ into groups of rows, and sending these to different processors. These processors will, in turn, perform a local Gaussian elimination and share the necessary information between them, see [3]. On the other hand, a more topological approach is presented in [21]. It uses the blow-up complex introduced in [33]. This approach first takes a cover $\mathcal{C}$ of a filtered simplicial complex $K$, and uses the result that the persistent homology of $K$ is isomorphic to that of the blow-up complex $K^{\mathcal{C}}$. It proceeds by computing the sparsified persistent homology for each cover, and then using this information to reduce the differential of $K^{\mathcal{C}}$ efficiently. Both of these parallelization methods have provided substantial speedups compared to the standard method presented in [18].
Following the ideas in [33], having an understanding of how persistence barcodes relate to a cover can help us obtain better representatives. On this basis, it would be desirable to have a method that leads to the speedups from [3,21], while still keeping cover information from [33]. Further, it would also be desirable to drop all restrictions on covers, and consider functional covers such as those used in the mapper algorithm, see [28]. This last point substantially limits the use of the blow-up complex, since the number of simplices grows very quickly when we allow the intersections to grow. In fact, in the extreme case where a complex $K$ is covered by $n$ copies of $K$, the blow-up complex $K^{\mathcal{C}}$ has size $2^n |K|$.

Álvaro Torras Casas is supported by an EPSRC grant with reference EP/N509449/1.
1.2. The Persistence Mayer-Vietoris spectral sequence and related literature. Since distribution is an important issue in persistent homology, it is worth exploring which classical tools of algebraic topology could be used in this context. A very well-known tool for distributing homology computations is the Mayer-Vietoris spectral sequence; see [10] for a quick introduction to spectral sequences. It is no surprise that these objects work in this context, since they have been employed for similar problems for a long time, see [4] or [23]. Since the category of persistence modules and persistence morphisms is an abelian category, the process of computing a spectral sequence should be more or less straightforward. However, there is always the question of how we implement this in practice. Furthermore, this approach has already been proposed in [22], although without a solution to the extension problem. Later, spectral sequences were used for distributing computations of cohomology groups in a field in [12], and recently in [31] and [32] spectral sequences are used for distributing persistent homology computations. However, all of [12,31,32] assume that the nerve of the cover is one dimensional.
The first problem when dealing with spectral sequences is that we need to be able to compute images, kernels and quotients. Needless to say, these should be computed in an optimal way. This question has already been studied in [11], where the authors give a very efficient algorithm. However, there are a couple of problems that come up when using [11] in spectral sequences:
(1) In [11] the authors assume that a given morphism is induced by the inclusion $X \subseteq Y$ of two given filtered simplicial complexes. This is not the case in spectral sequences, where the maps in the second, third and higher pages are not induced by a simplicial morphism. Furthermore, even when computing the first page this is not the case. Indeed, the Čech differentials are not inclusions at all, since each simplex is mapped to its copies on different covers. This means that the algorithm in [11] needs to be adapted to our case.
(2) A key assumption in [11] is that the filtrations in $X$ and $Y$ are both general. This is a fairly broad premise in cases such as when both $X$ and $Y$ are Vietoris-Rips complexes on two point clouds. However, in spectral sequences this hypothesis hardly ever holds. Indeed, this follows from the fact that a simplex might be contained in various overlapping covers. As one can see in Table 2 from [11], the authors assume that there are only 6 possible combinations of births and deaths in images, kernels and cokernels. When generality does not hold, the number of cases is arbitrary.
Thus, if we want to compute images, kernels and cokernels, we will need to be able to overcome these two difficulties first. Also, notice that a good solution should lead to the representatives, as these are needed for the spectral sequence.
The other difficulty that one might encounter in spectral sequences comes with the extension problem. That is, once we have computed the spectral sequence, we still need to recompose broken barcodes in order to recover the global persistent homology. Within the context of persistent homology, the extension problem first appeared in Section 6 of [20]. There the authors give an approximate result that holds in the case of acyclic coverings. This allows them to compare the persistent homology to the lower row of the infinity page in the spectral sequence. This leads to an ε-interleaving between the global persistent homology and that of the filtered nerve. Later, the extension problem appeared in the PhD thesis of Hee Rhang Yoon [31], and also in the recent joint work with Robert Ghrist [32]. In Section 4.2.3 of Yoon's thesis, the author gives a detailed solution for the extension problem in the case when the nerve of the cover is one dimensional.
1.3. Original Contribution. In this paper, we set the theoretical foundations for a distributed method on the input data. In order to do this, we use the algebraic power of the Mayer-Vietoris spectral sequence. Since the aim is to build up an explicit algorithm, we need to develop linear algebra of persistence modules, as done throughout Section 3. There, we define barcode bases and we also develop an operation that allows us to determine whether a group of barcode vectors is linearly independent or not. This machinery, although it might seem artificial, is the key to understanding what it really means to subtract columns from left to right in the Gaussian elimination outlined in image kernel, see Algorithm 1. Also, it helps us to encapsulate [...]

By using the ideas in this text we developed PERMAVISS, a Python3 library that computes the Persistence Mayer-Vietoris spectral sequence. In the results from [29], one can see that nontrivial higher differentials come up, and also that the extension problem frequently has a nontrivial solution. This supports the idea that the spectral sequence adds more information on top of persistent homology. Finally, we outline future directions, both for the study of the Persistence Mayer-Vietoris spectral sequence and for future versions of PERMAVISS.
Definition 2.1. Given a set $X$, a simplicial complex $K$ is a subset of the power set $K \subseteq P(X)$ such that if $\sigma \in K$, then for all subsets $\tau \subseteq \sigma$ we have that $\tau \in K$. An element $\sigma \in K$ will be called an $n$-simplex whenever $|\sigma| = n + 1$, whereas a subset $\tau \subseteq \sigma$ will be called a face. Thus, if a simplex is contained in $K$, all its faces must also be contained in $K$. Given a simplicial complex $K$, we denote by $K_n$ the set containing all the $n$-simplices from $K$. Given a pair of simplicial complexes $K$ and $L$, if $L \subseteq K$, then we say that $L$ is a subcomplex of $K$. Also, given a mapping $f : K \to L$ between two simplicial complexes $K$ and $L$, we call $f$ a simplicial morphism whenever $f(K_n) \subseteq \bigcup_{l=0}^{n} L_l$ for all $n \ge 0$. The category composed of simplicial complexes and simplicial morphisms will be denoted by SpCpx.
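Let $F$ be a field. For each $n \ge 0$ we define the free vector space over the $n$-simplices of $K$ as $S_n(K) := F[K_n]$. We also consider linear maps $d_n : S_n(K) \to S_{n-1}(K)$ defined by
(1) $d_n([v_0, \ldots, v_n]) = \sum_{i=0}^{n} (-1)^i [v_0, \ldots, \hat{v}_i, \ldots, v_n]$,
where the hat notation, $\hat{v}_i$, is used to indicate omission of a vertex. Setting $S_n(K) = 0$ for all $n < 0$ we put all of these in a sequence
(2) $0 \leftarrow S_0(K) \xleftarrow{d_1} S_1(K) \xleftarrow{d_2} \cdots \xleftarrow{d_n} S_n(K) \leftarrow \cdots$
It follows from formula (1) that the composition of two consecutive differentials vanishes: $d_n \circ d_{n+1} = 0$ for all $n \ge 0$. In this case we say that (2) is a chain complex. As a consequence, we have that $\mathrm{Im}(d_{n+1}) \subseteq \mathrm{Ker}(d_n)$, and we can define the homology with coefficients in $F$ to be $H_n(K; F) := \mathrm{Ker}(d_n) / \mathrm{Im}(d_{n+1})$ for all $n \ge 0$. In general, $F$ will be understood from the context and the notation $H_n(K)$ might be used instead.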
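As a small illustration of the chain complex and its homology, the following Python sketch builds boundary matrices and computes Betti numbers of a simplicial complex. It is only a sketch under stated assumptions: the field is taken to be $F = \mathbb{F}_2$ (so the signs in the differential disappear), simplices are stored as sorted tuples of vertices, and the function names `closure`, `boundary_matrix`, `rank_mod2` and `betti` are hypothetical helpers introduced here, not part of any library.

```python
from itertools import combinations


def closure(simplices):
    """All nonempty faces of the given simplices, stored as sorted tuples."""
    faces = set()
    for s in simplices:
        s = tuple(sorted(s))
        for k in range(1, len(s) + 1):
            faces.update(combinations(s, k))
    return faces


def boundary_matrix(K, n):
    """Matrix of d_n over F_2 (assumption): rows are (n-1)-simplices, columns n-simplices."""
    rows = sorted(s for s in K if len(s) == n)
    cols = sorted(s for s in K if len(s) == n + 1)
    index = {s: i for i, s in enumerate(rows)}
    M = [[0] * len(cols) for _ in rows]
    for j, s in enumerate(cols):
        for i in range(len(s)):
            face = s[:i] + s[i + 1:]  # omit the i-th vertex
            if face:                  # d_0 = 0, so the empty face is skipped
                M[index[face]][j] = 1
    return M


def rank_mod2(M):
    """Rank of a 0/1 matrix over F_2 by Gaussian elimination."""
    M = [row[:] for row in M]
    ncols = len(M[0]) if M else 0
    rank = 0
    for c in range(ncols):
        piv = next((r for r in range(rank, len(M)) if M[r][c]), None)
        if piv is None:
            continue
        M[rank], M[piv] = M[piv], M[rank]
        for r in range(len(M)):
            if r != rank and M[r][c]:
                M[r] = [x ^ y for x, y in zip(M[r], M[rank])]
        rank += 1
    return rank


def betti(K, n):
    """dim H_n = dim S_n - rank d_n - rank d_{n+1}."""
    dim_Sn = sum(1 for s in K if len(s) == n + 1)
    return dim_Sn - rank_mod2(boundary_matrix(K, n)) - rank_mod2(boundary_matrix(K, n + 1))


# Hollow triangle: one connected component and one 1-dimensional cycle.
hollow = closure([(0, 1), (1, 2), (0, 2)])
print(betti(hollow, 0), betti(hollow, 1))
```

Working over $\mathbb{F}_2$ keeps the sketch short; for a general field the entries of the boundary matrix would carry the signs $(-1)^i$ from formula (1).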
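On the other hand, we consider the augmentation map $\varepsilon : S_0(K) \to F$ defined by the assignment $s \mapsto 1_F$, for any simplex $s \in K_0$. Then, we define the reduced homology by $\tilde{H}_0(K; F) = \mathrm{Ker}(\varepsilon) / \mathrm{Im}(d_1)$, and $\tilde{H}_n(K; F) = H_n(K; F)$ for all $n > 0$. Consider the chain complex $\tilde{S}_*(K)$, obtained by augmenting (2) by $\varepsilon$ and a copy of $F$ in degree $-1$:
$0 \leftarrow F \xleftarrow{\varepsilon} S_0(K) \xleftarrow{d_1} S_1(K) \xleftarrow{d_2} \cdots$
Then one can see that computing reduced homology is the same as computing homology on $\tilde{S}_*(K)$.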
Definition 2.2 (Standard m-simplex). Given $m > 0$, we define $\Delta^m = P(\{0, 1, \ldots, m\})$, which will be called the standard $m$-simplex. This leads to a chain complex $S_*(\Delta^m)$ [...]

For each simplex $\sigma \in \Delta^m$, we will use the notation $U_\sigma = \bigcap_{i \in \sigma} U_i$. Altogether, we define the nerve of $\mathcal{U}$ as the simplicial complex $N_{\mathcal{U}} = \{\sigma \in \Delta^m : U_\sigma \neq \emptyset\}$. This leads to an augmented chain complex $S_*(N_{\mathcal{U}})$ with differentials denoted by $d^{N_{\mathcal{U}}}_*$. In particular, given a simplex $\sigma \in N_{\mathcal{U}}$, we have a simplicial injection $f_\sigma : \Delta^{|\sigma|} \to N_{\mathcal{U}}$. This induces an injection of chain complexes [...]

Definition 2.4 (Čech chain complex). Let $K$ be a simplicial complex and let $\mathcal{U} = \{U_i\}_{i=1}^{m}$ be a cover of $K$ by $m$ subcomplexes. For each simplex $s \in K$, there exists a simplex $\sigma(s) \in N_{\mathcal{U}}$ with maximal cardinality $|\sigma(s)|$, so that $s \in U_{\sigma(s)}$. Then, for a fixed degree $n \ge 0$, we define the $(n, \mathcal{U})$-Čech chain complex by [...] For $k \ge -1$, we will use the notation $(\tau)_s$ with $s \in K_n$ and $\tau \in S_k(\Delta^{|\sigma(s)|})$, to denote an element in $\check{C}_k(n, \mathcal{U}; F)$ that is zero everywhere except for $\tau$ in the component indexed by $s$. Then the image of the [...] Notice that by definition the Čech complex is a chain complex and is exact. Also, one can see that [...] follows easily. On the other hand, for each $k \ge 0$ we define an isomorphism [...] by sending $(\tau)_s$ to $(s)_\tau$ for any pair of simplices $s \in K_n$ and $\tau \in f$ [...] In particular, we can rewrite the $(n, \mathcal{U})$-Čech chain complex as a sequence
(3) [...]
where the differentials $\delta_i$ are chosen in order to commute with the $\psi_i$'s. That is, for any pair of simplices $\sigma \in (N_{\mathcal{U}})_k$ and $s \in (U_\sigma)_n$, we have equalities [...]

Remark. Alternatively, the Čech chain complex can be defined straight away as the sequence (3). Then, one can see that this is an exact chain complex by using cosheaf theory. Namely, given a simplicial complex $K$, we consider the topology where the open sets are given by subcomplexes. Then, for each integer $n \ge 0$, one has the simplicial precosheaf as an assignment [...] for each subcomplex $V \subseteq K$. This precosheaf is in fact a flabby cosheaf. Then, using 2.5, 4.3, and 4.4 from Section VI in [5], one has exactness of the Čech chain complex.
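To make the nerve of a cover and the simplex $\sigma(s)$ concrete, here is a minimal Python sketch. It assumes that each cover element is given as a set of simplices (sorted tuples of vertices) and that cover elements are indexed from 0; the names `nerve` and `sigma_of` are hypothetical and introduced only for this illustration. The enumeration goes through all subsets of the cover, so it is only meant for small examples.

```python
from itertools import combinations


def nerve(cover):
    """Index tuples sigma whose cover elements have nonempty common intersection."""
    simplices = []
    for k in range(1, len(cover) + 1):
        for sigma in combinations(range(len(cover)), k):
            # U_sigma is the intersection of the U_i with i in sigma
            if set.intersection(*(set(cover[i]) for i in sigma)):
                simplices.append(sigma)
    return simplices


def sigma_of(s, cover):
    """Largest nerve simplex sigma(s) such that s lies in U_sigma(s)."""
    return tuple(i for i, U in enumerate(cover) if s in U)


# A path with four vertices covered by its three edges (with endpoints).
cover = [
    {(0,), (1,), (0, 1)},
    {(1,), (2,), (1, 2)},
    {(2,), (3,), (2, 3)},
]
print(nerve(cover))           # nerve simplices as tuples of cover indices
print(sigma_of((1,), cover))  # vertex 1 lies in the first two cover elements
```

In this example the nerve is itself a path with three vertices, so it is one dimensional, which is the situation assumed in [12,31,32].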
2.2. Persistence Modules. Let $R$ be the category of real numbers as a poset, where $\hom_R(s, t)$ contains a single morphism whenever $s \le t$, and is empty otherwise. Let $F$ be a field and let Vect denote the category of $F$-vector spaces. Also let vect ⊂ Vect be the subcategory of finite dimensional $F$-vector spaces.

Definition 2.5. A filtered simplicial complex is a functor $K : R \to$ SpCpx, such that $K_s \subseteq K_t$ for any pair $s \le t$ in $R$. Notice that the results from subsection 2.1 also hold for filtered simplicial complexes.

Definition 2.6. Let $K$ be a filtered simplicial complex and $n \ge 0$. We define the $n$-persistent homology of $K$ as the composed functor $H_n(K) : R \to$ Vect. We will also denote this by $PH_n(K)$.

Definition 2.7. A persistence module $V$ is a covariant functor $V : R \to$ Vect. That is, to any $r \in R$, $V$ assigns a vector space in Vect which will be denoted either by $V(r)$ or $V_r$. Additionally, to any pair of real numbers $s \le t$, there is a linear morphism $V(s \le t) : V_s \to V_t$. These morphisms satisfy $V(s \le s) = \mathrm{Id}_{V_s}$ for any $s \in R$, and the relation $V(r \le t) = V(s \le t) \circ V(r \le s)$ for all $r \le s \le t$ in $R$. Given two persistence modules $V$ and $W$, a morphism of persistence modules is a natural transformation $f : V \to W$. Thus, for any pair of real numbers $s \le t$, there is a commuting square, that is, $f_t \circ V(s \le t) = W(s \le t) \circ f_s$. We denote by PMod the category of persistence modules and persistence morphisms.
Hence, whenever we are speaking about the naturality of $f$ we will be referring to the commutative square above. We say that a persistence morphism $f : V \to W$ is an isomorphism whenever $f_t$ is an isomorphism for all $t \in R$. We write $V \simeq W$ to denote that $V$ is isomorphic to $W$. A pointwise finite dimensional (p.f.d.) persistence module is a functor $V : R \to$ vect, where vect is the category of finite dimensional vector spaces.

Definition 2.8. A sequence of persistence modules and persistence morphisms [...]

Example 1. A special class of persistence modules will be the interval modules. For any pair of real numbers $s \le t$, we denote by $I(s, t)$ the interval module
(4) $I(s, t)(r) = F$ for $r \in [s, t)$, and $0$ otherwise.
The morphisms I(s,t)(a ≤ b) will be the identity for any two a, b ∈ [s,t) and will be 0 otherwise.
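The interval module of Example 1 can be sketched in a few lines of Python. This is only an illustration under simple assumptions: each space $I(s,t)(r)$ is either $0$ or one dimensional, so a structure morphism is encoded by a single scalar (1 for the identity of $F$, 0 for the zero map). The class name `IntervalModule` and its methods are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntervalModule:
    """Interval module I(s, t) over the half-open interval [s, t)."""
    birth: float
    death: float  # may be float("inf")

    def dim(self, r):
        """Dimension of I(s, t)(r): 1 inside [s, t), 0 outside."""
        return 1 if self.birth <= r < self.death else 0

    def structure_map(self, a, b):
        """Scalar representing I(s, t)(a <= b): 1 if a and b both lie in [s, t), else 0."""
        if a > b:
            raise ValueError("structure maps are only defined for a <= b")
        return 1 if self.dim(a) and self.dim(b) else 0


bar = IntervalModule(1.0, 3.0)
print([bar.dim(r) for r in (0.5, 1.0, 2.9, 3.0)])
print(bar.structure_map(1.0, 2.0), bar.structure_map(2.0, 3.0))
```

With this encoding the functoriality relation $V(r \le t) = V(s \le t) \circ V(r \le s)$ becomes a product of scalars, which can be checked on any finite grid of parameters.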
Notice that in an analogous way we could have defined barcodes $I(s, t)$ over intervals of the form $[s, t]$, $(s, t]$ or $(s, t)$, with $s \le t$. For a given interval $I(s, t)$, the values $s$ and $t$ will be called respectively the birth and death values. Whenever $V$ is a p.f.d. persistence module, it can be uniquely decomposed as a direct sum of barcodes $\bigoplus_{i \in J} I(s_i, t_i)$, as shown in [7]. This means that there is an isomorphism $V \simeq \bigoplus_{i \in J} I(s_i, t_i)$ of persistence modules. This will be called the barcode decomposition of $V$. Throughout this text, we will mainly be studying persistence modules that decompose into barcodes of the form (4).

HOMOLOGY OF PERSISTENCE MODULES
3.1. Barcode Bases. In this section we will use the result from [7] to introduce barcode bases. Our aim will be to come up with an efficient way of computing homology in this category. At the end we will introduce an algorithm for computing images and kernels, and we will evaluate its computational complexity.

Definition 3.1 (Barcode Basis). A barcode basis $\mathcal{B}$ of a persistence module $V$ is a choice of an isomorphism $\beta : \bigoplus_{i \in I} I(a_i, b_i) \to V$. Each direct summand of $\beta$ defines a restricted morphism from a barcode $\beta_i : I(a_i, b_i) \to V$, and will be called a barcode generator. We will usually denote a barcode basis $\mathcal{B}$ by the set of barcode generators $\mathcal{B} = \{\beta_i\}_{i \in I}$.
Within the context of Definition 3.1, we would like to make some notational remarks.
• Given a barcode generator $\beta \in \mathcal{B}$, we write $\beta \sim [a, b)$ to denote that $\beta$ is a natural transformation $\beta : I(a, b) \to V$. In this case we say that $\beta$ is associated to the interval $[a, b)$.
• Notice that if we choose $\beta \in \mathcal{B}$ with $\beta \sim [a, b)$ and $r \in R$, we have a linear transformation $\beta(r) : I(a, b)(r) \to V(r)$. In particular, since $I(a, b)(r)$ is either $0$ or $F$, the morphism $\beta(r)$ is uniquely determined by the image $\beta(r)(1_F) \in V(r)$. For the sake of simplicity, we will write $\beta(r) \in V(r)$ instead of $\beta(r)(1_F) \in V(r)$.
• For any given $r \in R$, we define the pointwise basis in $r$ by [...] In this case, if $\beta \in \mathcal{B}^r$ and $\beta \sim [a_\beta, b_\beta)$, then $a_\beta \le r < b_\beta$ by naturality of $\beta$. Also, evaluating all the elements from $\mathcal{B}^r$ on $1_F$ leads to a vector basis $\mathcal{B}^r(1_F)$ for $V(r)$.
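The condition $a_\beta \le r < b_\beta$ for generators in the pointwise basis can be illustrated with a short Python sketch. It assumes that a barcode basis is stored only through its list of associated intervals, and the helper names `pointwise_indices` and `critical_values` are hypothetical. The second helper lists the finite endpoints, which are the only parameters where the pointwise basis can change.

```python
from math import inf


def pointwise_indices(intervals, r):
    """Indices of the generators beta ~ [a, b) with a <= r < b."""
    return [i for i, (a, b) in enumerate(intervals) if a <= r < b]


def critical_values(intervals):
    """Sorted finite endpoints: the pointwise basis is constant between consecutive values."""
    return sorted({x for ab in intervals for x in ab if x != inf})


intervals = [(0, 2), (1, 3), (1, inf)]
for r in critical_values(intervals):
    # The number of indices equals dim V(r) for a module with this barcode.
    print(r, pointwise_indices(intervals, r))
```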
Remark. We can think of a persistence module $V$ as a sheaf over $R$, where $R$ is endowed with the topology where the open sets are either the intervals $[a, \infty)$ or $(a, \infty)$, for any $a \in R$. Thus the restriction morphism [...] A barcode basis is a set of global sections of the sheaf $V$, such that they form a pointwise basis of the vector spaces $V_r$, for all $r \in R$. That is, $\mathcal{B} \subset V$ forms a barcode basis for $V$ if and only if $\mathcal{B}^r$ forms a basis of $V_r$ for all $r \in R$.
To make our work less cumbersome, we will only focus on very simple persistence modules. In fact, these modules will be the only ones relevant for our later applications.

Definition 3.2. A tame persistence module $V$ is a p.f.d. persistence module that admits a finite barcode basis $\mathcal{B} = \{\beta_i\}_{1 \le i \le N}$ and all the barcodes $\beta_i$ are associated to an interval of the form $[a_i, b_i)$.

Thus, whenever we are speaking about tame persistence modules, we will assume that $I(a, b)$ denotes a barcode over $[a, b)$. The first problem one encounters when working with a barcode basis $\mathcal{B} = \{\beta_i\}_{i \in I}$ is taking linear combinations. Whenever we take a barcode generator $\beta_1 \in \mathcal{B}$ we have a natural transformation $\beta_1 : I(a_1, b_1) \to V$. However, this property does not need to hold for general sums. For example, suppose that $\beta_1 \sim [0, 2)$ and $\beta_2 \sim [1, 3)$ are two barcode generators from $\mathcal{B}$; then we can define the sum pointwise $\gamma(r) := \beta_1(r) + \beta_2(r) \in V(r)$ for all $r \in R$. Even though this $\gamma$ is well defined, this assignment does not define a natural transformation. This is depicted in Figure 1, where we have $\gamma(1) = \beta_1(1) + \beta_2(1) \neq \beta_1(1) = V(0 \le 1)(\gamma(0))$.

Figure 1. Sum of barcode generators might not be natural.
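More generally, assume that the naturality condition for the pointwise sum of two barcode generators $\beta_1 \sim [a_1, b_1)$ and $\beta_2 \sim [a_2, b_2)$ is not satisfied for some $r \le s$ in $R$. Something that we can do in order to 'correct' this situation is to 'chop down' the non-natural part. That is, we consider the following operation
$\beta_1 \boxplus \beta_2 := 1_{\max(a_1, a_2)} (\beta_1 + \beta_2)$,
where we have used the step function $1_s : R \to F$ defined by $1_s(r) = 1_F$ if $r \ge s$, and $1_s(r) = 0$ otherwise. Notice that, when $a_1 \le a_2$ and $b_2 \le b_1$, $\beta_1 \boxplus \beta_2$ is associated to the interval $I(a_2, b_1)$. More generally, suppose we want to compute $\boxplus_{1 \le j \le m} k_j \beta_j$ with $k_j \in F$ and $\beta_j \sim [a_j, b_j)$ for all $1 \le j \le m$. Taking into account the definition of $\boxplus$ for two terms and also the fact that $1_a 1_b = 1_{\max(a,b)}$, we can inductively extend the definition:
$\boxplus_{1 \le j \le m} k_j \beta_j := 1_A \big( \sum_{1 \le j \le m} k_j \beta_j \big)$,
where $A = \max\{a_j : 1 \le j \le m, k_j \neq 0\}$. In the trivial case of $k_j = 0$ for all $1 \le j \le m$, we will set the above definition to zero. On the other hand, considering the value $B = \max\{b_j : 1 \le j \le m, k_j \neq 0\}$, we have that $\boxplus_{1 \le j \le m} k_j \beta_j$ is associated to $I(A, B)$. Of course these $\beta_j$ do not need to form a basis, so perhaps the previous sum could have a more adjusted associated interval. This operation will be of great use when working with persistence morphisms.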
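The 'chopped' sum described above can be sketched in Python as follows. This is an illustration under assumptions that are not in the text: a linear combination is stored as a triple `(A, B, coeffs)` with `A` the largest birth and `B` the largest death among the generators with nonzero coefficient, coefficients are rationals, and generators are referred to by their index in a list of intervals. The function names `boxplus` and `evaluate` are hypothetical.

```python
from fractions import Fraction


def boxplus(terms, intervals):
    """Chopped sum of k_j * beta_j, where intervals[j] = (a_j, b_j).

    Returns (A, B, coeffs) for the vector 1_A(sum_j k_j beta_j), associated
    to [A, B), or None when every coefficient vanishes.
    """
    coeffs = {}
    for k, j in terms:
        coeffs[j] = coeffs.get(j, 0) + Fraction(k)
    coeffs = {j: k for j, k in coeffs.items() if k != 0}
    if not coeffs:
        return None
    A = max(intervals[j][0] for j in coeffs)
    B = max(intervals[j][1] for j in coeffs)
    return A, B, coeffs


def evaluate(vec, intervals, r):
    """Coordinates of the vector at parameter r, in terms of the generators alive at r."""
    if vec is None:
        return {}
    A, _, coeffs = vec
    if r < A:  # the step function 1_A vanishes before A
        return {}
    return {j: k for j, k in coeffs.items()
            if intervals[j][0] <= r < intervals[j][1]}


# beta_1 ~ [0, 2) and beta_2 ~ [1, 3), as in the example preceding Figure 1.
intervals = [(0, 2), (1, 3)]
gamma = boxplus([(1, 0), (1, 1)], intervals)
print(gamma[:2])                        # associated interval endpoints
print(evaluate(gamma, intervals, 0))    # chopped away before the second birth
print(evaluate(gamma, intervals, 2))    # only beta_2 survives past 2
```

Storing the birth `A` explicitly is one possible design choice; it plays the role of the step function coefficient that multiplies the sum.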
Remark. Let us introduce some properties of the step function $1_s$ for $s \in R$. For any $\beta \sim [a, b)$ one has [...] is a basis of $V$. Then $1_s(\beta + \gamma)$ and $1_s(\beta + \tau)$ are linearly independent for all $s < 1$, but are equal for all $s \ge 1$. Throughout this section it will be important to have these basic properties in mind.

Remark. Alternatively, one can recall the definition of persistence modules as $F[x^{[0,\infty)}]$-modules, where $F[x^{[0,\infty)}]$ denotes the polynomial ring with $F$-coefficients and allowing all powers $x^r$ for $r \in [0, \infty)$, where by convention $x^0 = 1_F$. Given a persistence module $V$, one defines a barcode vector as a morphism of [...] These barcode vectors do not need to be injective. We denote by $\mathcal{V}(V)$ the set of all barcode vectors of $V$. The operation $\boxplus$ and the step function [...] as defined above. Then, the step function [...], or the barcode vector [...] for $a_1 < s$; the latter is defined to be the restriction of $v$ to the subideal $(x^s)$, since one has that $x^s = x^{a_1} x^{s - a_1}$. Suppose we have another barcode vector $w$ [...] Then, one defines the barcode sum $\boxplus$: [...] where $A = \max\{a_1, a_2\}$ and $B = \max\{b_1, b_2\}$. In this context, a barcode basis $\mathcal{B}$ is a set of barcode vectors such that: (1) [...] $F[x^{[0,\infty)}]$-linearly independent with respect to $\boxplus$.

Figure 2. There is no canonical way of relating barcodes from $f$.
Notice that, while there is a uniquely determined barcode decomposition of $V$, the particular choice of a basis is not unique. This is analogous to the case of vector spaces, where a vector space can admit multiple bases but always has the same dimension. The main reason why we are introducing barcode bases is that we would like to work with morphisms between persistence modules $f : V \to W$. Even though the respective barcode decompositions of $V$ and $W$ are determined, there is no unique 'assignment' of barcodes induced by $f$. In fact, it was proven in [2, Prop. 5.10] that matchings between barcodes of $V$ and $W$ cannot be defined in a functorial way. The following example will illustrate this principle.
Example 2. Consider two persistence modules: [...] Suppose that we had chosen an alternative barcode basis for $W$ defined by setting $\beta'_1 = \beta_1 + \beta_2$ and $\beta'_2 = \beta_2$. Thus, in this case we have that $f(\alpha_1) = 1_1(\beta'_1 - \beta'_2)$. Notice that the morphism $f$ will relate different intervals depending on the chosen barcode bases. Therefore, when studying morphisms we should not work directly with barcodes, but with barcode bases instead. This is illustrated in Figure 2.
Let $f : V \to W$ be a morphism of tame persistence modules and consider two bases $\mathcal{A}$ and $\mathcal{B}$ for $V$ and $W$ respectively. For each barcode generator $\alpha \sim [a, b)$ in $\mathcal{A}$, we would like to define the image $f(\alpha)$ in terms of $\mathcal{B}$. First notice that we have an expression for $f(\alpha)(a)$ in terms of $\mathcal{B}^a$, since this forms a basis for $W_a$. Thus there exist coefficients $k_{\beta,\alpha} \in F$ for all $\beta \in \mathcal{B}^a$ such that $f(\alpha)(a) = \sum_{\beta \in \mathcal{B}^a} k_{\beta,\alpha} \beta(a)$. Therefore, since $f$ is natural, we can write the image $f(\alpha)$ as [...] since otherwise $f$ would not be natural as a persistence morphism. Thus we can define the subset of $\mathcal{B}^a$ associated to $\alpha$: [...] where $\alpha \sim [a, b)$. The set $\mathcal{B}(\alpha)$ contains the barcode generators $\beta \in \mathcal{B}$ such that the coefficients $k_{\beta,\alpha}$ might be non-zero. This gives us a sharper description of $f(\alpha)$: [...] Notice that there is no distinction between the expression above using $\boxplus$ and the ordinary sum. This is because we have already 'cut away' the non-natural part of the sum. In particular, if the expression above is associated to $I(A, B)$, then we can deduce that $A \le a$ and $B \le b$. To visualize this, consider Figure 3, illustrating the restriction of $f$ to some barcode $\alpha \sim I(a, b)$. By pointwise linearity and naturality of $f$, we have that [...] where $k_\alpha \in F$ for all $\alpha \in \mathcal{A}$.

3.2. Computing Kernels and Images. Let $f : V \to W$ be a morphism of tame persistence modules. The kernel of $f$ is a persistence module $\mathrm{Ker}(f)$ together with an inclusion morphism $j : \mathrm{Ker}(f) \to V$, such that $\mathrm{Ker}(f)_r \simeq \mathrm{Ker}(f_r)$ for all $r \in R$. Therefore, if $\mathcal{K}$ is a barcode basis for the kernel, then for each barcode generator $\kappa \sim [a, b)$ we have [...] where $k_{\alpha,\kappa} \in F$ for all $\alpha \in \mathcal{A}$. By 'finding' a basis for the kernel we mean that we want to find $j(\mathcal{K})$ in terms of the basis $\mathcal{A}$. The image of $f$, which will be denoted as $\mathrm{Im}(f)$, is a persistence module together with a projection $q : V \twoheadrightarrow \mathrm{Im}(f)$, such that $\mathrm{Im}(f)_r \simeq \mathrm{Im}(f_r)$ for all $r \in R$. Let $\mathcal{A}$ be a basis for $V$ and $\mathcal{I}$ be a basis for $\mathrm{Im}(f)$. Then for each generator $\gamma \in \mathcal{I}$ with $\gamma \sim [a, b)$, there exist coefficients $c_{\alpha,\gamma} \in F$ such that $\gamma = \sum_{\alpha \in \mathcal{A}} c_{\alpha,\gamma} q(\alpha)$.
Notice that there was no need to multiply the above expression by $1_a$, since $\gamma$ is in the image of $f$. Thus, by finiteness of $\mathcal{A}$, there must exist some $\alpha \in \mathcal{A}$ such that $q(\alpha)$ has birth value $a$. Additionally, we will have an inclusion $\iota : \mathrm{Im}(f) \to W$ such that $f = \iota \circ q$. Notice that $\iota : \mathrm{Im}(f) \to W$, being an inclusion, will have properties analogous to those discussed for the kernel. Hence, there will be coefficients $e_{\beta,\gamma} \in F$ satisfying the equation [...] for each $\gamma \in \mathcal{I}$ with $\gamma \sim [a, b)$. Putting these two together and considering the image of $f$ in terms of $\mathcal{B}$, that is, the equality $(f(\alpha))_{\alpha \in \mathcal{A}} = (\beta)_{\beta \in \mathcal{B}} (b_{\beta,\alpha})_{\mathcal{B} \times \mathcal{A}}$, we get the matrix equation [...] In general, we start from the matrix $(b_{\beta,\alpha})_{\mathcal{B} \times \mathcal{A}}$ and will proceed to find the coefficients $e_{\beta,\gamma}$ and $c_{\alpha,\gamma}$. This will be done by a process very similar to a Gaussian elimination. Each non-zero column $(e_{\beta,\gamma})_{\beta \in \mathcal{B}}$ will lead to a barcode generator of the image. Its counterpart $(c_{\alpha,\gamma})_{\alpha \in \mathcal{A}}$ will lead to a basis for the kernel of $f$, although we will need to perform an additional Gaussian elimination. See Figure 4 for an illustration of these concepts. Some of these observations have already been studied in [2].
A point to notice is that there is a natural ordering for $\mathcal{B}$. For any pair of barcode generators $\alpha \sim [a, b)$ and $\beta \sim [c, d)$, we will write $\alpha < \beta$ whenever $a < c$, or when we have that $a = c$ and $d < b$. As before, consider two finite barcode bases $\mathcal{A} = \{\alpha_i\}_{0 \le i \le n}$ and $\mathcal{B} = \{\beta_j\}_{0 \le j \le m}$ for $V$ and $W$ respectively. Additionally, suppose that both $\mathcal{A}$ and $\mathcal{B}$ have total orderings. That is, even if two barcode generators are associated to the same interval $\alpha_1, \alpha_2 \sim [a, b)$, we have already made a choice $\alpha_1 < \alpha_2$. Then we consider $M = (f(\alpha_1), \ldots, f(\alpha_n))$, the matrix of $f$ in the bases $\mathcal{A}$ and $\mathcal{B}$. The aim will be to transform $M$ performing left to right column additions so that we obtain a matrix [...] for suitable $k_{i,j} \in F$ and $0 \le i < j \le n$. This $\mathcal{I}$ will have the property that its non-zero columns form a basis for $\mathrm{Im}(f)$. Also, we can find coefficients $q_{i,j} \in F$ and $c_j \in F$ for all $0 \le i < j \le n$, such that the set [...] forms a basis for $\mathrm{Ker}(f)$. In the following we will present an algorithm obtaining such bases. First we will go through an illustrative example encoding some of the basic principles of the procedure.

[...] with barcode bases $(\alpha_1, \alpha_2, \alpha_3)$ and $(\beta_1, \beta_2, \beta_3)$ respectively. Let the morphism $f : V \to W$ be given by the $\mathcal{B} \times \mathcal{A}$ matrix: [...] Then we will have matrices associated to $f$ which are constant between pairs of consecutive parameters in [...], we start considering the matrix associated to $[1, 2)$, together with its reduction by columns, [...] Next we consider $F_2$ along the interval $[2, 3)$, which will inherit the previous reduction. Since a generator on the domain is being born, we add a new column at the right end of $F_2$. This will be reduced by subtracting the first two columns from the last one, [...]

[Figure: Decomposition of barcodes in image, kernel, domain and codomain of $f : V \to W$. The colors correspond to the different generators associated to $\mathcal{I}$ and $\mathcal{K}$.]

Now, we compute the matrix $F_3$ of $f$ along $[3, 4)$. We start from $R(F_2)$ and we take out the second row, since its associated interval ends at 3. Thus, we obtain $F_3$, which is already reduced, [...] Since the second column is zero, this means that a barcode has finished on the image. Thus, we add $f(\alpha_2 - \alpha_1) = -1_1 \beta_2$ into $\mathcal{I}$. Additionally, we add $1_3(\alpha_2 - \alpha_1)$ into $\mathcal{K}$. The next interval to consider is $[4, 5)$. Now, before looking at the matrix $F_4$ of $f$ along $[4, 5)$, we consider $1_4(\mathcal{K})$. That is, we look at the element [...] This already tells us extra information about the kernel of $F_4$. We check this when we compute $F_4$, [...] Notice that we do not need to add [...] Again, this is because a barcode generator has finished in the image of $f$. The reason why we are adding $f(-\alpha_1)$ instead of $f(\alpha_1)$ to $\mathcal{I}$ is that we detected this barcode from $1_4(\mathcal{K})$. Finally, since all generators in $\mathcal{A}$ die at 5, we add $f(\alpha_3 - \alpha_2) = 1_2(\beta_1)$ into $\mathcal{I}$. Altogether we have obtained a basis for the kernel and also a basis for the image: [...] Therefore we obtain isomorphisms $\mathrm{Ker}(f) \simeq I(3, 5)$ and $\mathrm{Im}(f) \simeq I(1, 3) \oplus I(1, 4) \oplus I(2, 5)$ with respective barcode bases $\mathcal{K}$ and $\mathcal{I}$. This is illustrated in Figure 5. In practice, instead of adding elements to $\mathcal{I}$, we will set $\mathcal{I}$ to be equal to $f(\mathcal{A})_{\mathcal{B}}$ and perform the corresponding reductions until we obtain a basis for the image of $f$.
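The ordering of barcode generators used in this subsection (earlier birth first and, for equal births, later death first) can be sketched as a Python sort key. This is only an illustration: generators are represented by their intervals, the name `barcode_order_key` is hypothetical, and ties between generators with the same interval are resolved by keeping their given order, which relies on Python's sort being stable.

```python
from math import inf


def barcode_order_key(interval):
    """Sort key: earlier birth first; for equal births, later death first."""
    a, b = interval
    return (a, -b)


bars = [(1, 3), (0, 2), (1, 4), (0, inf), (1, 3)]
# sorted() is stable, so the two copies of (1, 3) keep their relative order,
# matching the requirement that a choice has already been made between them.
ordered = sorted(bars, key=barcode_order_key)
print(ordered)
```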

3.3. Algorithm. Here, we present an algorithm performing the above procedure. Suppose that $f : V \to W$ is a morphism between two tame persistence modules. Let $\mathcal{A}$ and $\mathcal{B}$ be barcode bases for $V$ and $W$ respectively. Suppose also that we know $f(\mathcal{A})_{\mathcal{B}}$, the matrix associated to $f$ with respect to the barcode bases $\mathcal{A}$ and $\mathcal{B}$. We want to find a barcode basis for the image $\mathcal{I}$, and a barcode basis for the kernel $\mathcal{K}$. In order to achieve this, $\mathcal{I}$ is initialized as the $|\mathcal{B}| \times |\mathcal{A}|$ matrix $f(\mathcal{A})_{\mathcal{B}}$. Performing left to right column additions will lead to the nonzero columns of $\mathcal{I}$ forming a basis for the image. On the other hand, $\mathcal{K}$ will be a matrix with $|\mathcal{A}| + 1$ rows and whose number of columns will 'grow' as the computations develop. The extra row will be used for storing the parameter of the multiplying step function. Notice that $\mathcal{K}$ will have at most $|\mathcal{A}|$ columns, which is useful to know if we want to preallocate space for speed.
Notice that there exist values [...] We start by computing the values $a_i$ for all $0 \le i \le n$. We will denote by $\mathcal{A}^{a_i}(j)$ the index $1 \le \mathcal{A}^{a_i}(j) \le |\mathcal{A}|$ of the $j$-th element from $\mathcal{A}^{a_i}$. Also, given a matrix $A$, we will denote by $A[j]$ the $j$-th column of $A$. The matrices $R_i$ will denote the successive Gaussian reductions as we increase the parameter $0 \le i \le n + 1$. That is, we start with $\tilde{R}_0$, which will be the $|\mathcal{B}^{a_0}| \times |\mathcal{A}^{a_0}|$-matrix of $f$ along the interval $[a_0, a_1)$, then we reduce it to $R_0$. Simultaneously, we perform exactly the same transformations to $\mathcal{I}$. In order to track the additions performed, we will use an $|\mathcal{A}| \times |\mathcal{A}|$ matrix $T$, initialized as the identity matrix $\mathrm{Id}_{|\mathcal{A}|}$. Thus, whenever we add columns in $R_0$ we perform the same additions in $T$. On the other hand, if some column $R_0[j]$ becomes zero, where $1 \le j \le |\mathcal{A}^{a_0}|$, we add $T[\mathcal{A}^{a_0}(j)]$ at the right end of the matrix of kernels $K_0$. Additionally, we append $T[\mathcal{A}^{a_0}(j)]$ to $\mathcal{K}$, with associated step function coefficient $a_0$. Since we require $\mathcal{K}$ to be linearly independent, we will introduce a set pivots for tracking the pivots of the elements in $\mathcal{K}$. For each $T[\mathcal{A}^{a_0}(j)]$ that we add into $\mathcal{K}$, we add $\mathcal{A}^{a_0}(j)$ into pivots. Note that in this first step there will be no repeated elements in pivots and the matrix $\mathcal{K}$ will be already reduced.
Once we finish, we jump to the next parameter a 1 .
Let us go through the procedure for a 1 .For this, we add or take out rows and columns from R 0 and K 0 according to the life of each generator in A and B; these changes are stored into R 1 and K 1 , respectively.
Observe that $K_1$ might not be reduced. Since we would like to obtain a basis for the kernel of $f$, we reduce it further to $\widetilde{K}_1 = R(K_1)$, performing the same additions on $\mathcal{K}$. Next we proceed to reduce $R_1$. There is a trick we can use here to speed up the computations. For each column $j$ of $\widetilde{K}_1$, if the pivot $p$ of the column is such that $\mathcal{A}^{a_1}(p)$ is not in pivots, then the column $p$ of $R_1$ will become zero after reducing.
In that case we set $R_1[p]$ to zero directly, substitute the corresponding column of $\mathcal{I}$ by $f(\widetilde{K}_1[j])$, and add $\mathcal{A}^{a_1}(p)$ into pivots. Here by $f(\widetilde{K}_1[j])$ we mean the result of adding the columns of $f(\mathcal{A})_{\mathcal{B}}$ with coefficients given by $\widetilde{K}_1[j]$. Notice that this is the same as performing left to right column additions on that column, although we also permit the column to be multiplied by a non-zero coefficient $t \in \mathbb{F} \setminus \{0\}$. After performing these preprocessing tasks, we reduce $R_1$ into $\widetilde{R}_1$, repeating the same transformations on $T$ and $\mathcal{I}$. Then we examine $\widetilde{R}_1$ and look for zero columns $j$ such that $\mathcal{A}^{a_1}(j)$ is not in pivots. For each such column $j$, we append $T[\mathcal{A}^{a_1}(j)]$ at the right end of $\widetilde{K}_1$, and also into $\mathcal{K}$ with birth value $a_1$. Finally, we add $\mathcal{A}^{a_1}(j)$ into pivots. This finishes the iteration for $a_1$.
We repeat the previous step for the parameters $a_1 < a_2 < \cdots < a_n$. On the $i$-th iteration, where $2 \le i \le n$, we assume that we have well-defined matrices $\widetilde{R}_{i-1}$ and $\widetilde{K}_{i-1}$. As before, we update these matrices into a $|\mathcal{B}^{a_i}| \times |\mathcal{A}^{a_i}|$-matrix $R_i$, and a matrix $K_i$ with $|\mathcal{A}^{a_i}|$ columns. These updates are performed by adding and deleting columns as the bars from $\mathcal{A}$ and $\mathcal{B}$ are born or die, respectively. The rest of the procedure for $a_i$ is exactly as we outlined for $a_1$ earlier. Notice that while we are on the $i$-th step, both $\widetilde{K}_i$ and $\mathcal{K}$ will have the same number of columns. An outline of this procedure is shown in Algorithm 1.
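The bookkeeping behind these updates can be illustrated with a small sketch. It is not the paper's implementation: the function names `critical_values` and `active_indices` are hypothetical, bars are assumed to be half-open intervals $[a, b)$, and indices are 0-based rather than 1-based. The list returned by `active_indices` plays the role of the map $j \mapsto \mathcal{A}^{a_i}(j)$.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]  # half-open bar [birth, death)
INF = float("inf")


def critical_values(bars_a: List[Interval], bars_b: List[Interval]) -> List[float]:
    """Sorted finite endpoints at which some generator is born or dies."""
    values = {x for bar in bars_a + bars_b for x in bar if x != INF}
    return sorted(values)


def active_indices(bars: List[Interval], value: float) -> List[int]:
    """Global indices (0-based) of the bars alive at `value`.

    The j-th entry of the result is the global index of the j-th
    active generator.
    """
    return [j for j, (birth, death) in enumerate(bars) if birth <= value < death]


# Hypothetical barcode bases for the domain (A) and codomain (B).
bars_a = [(0.0, 2.0), (1.0, INF), (1.0, 3.0)]
bars_b = [(0.0, 3.0), (0.5, 2.0)]

values = critical_values(bars_a, bars_b)
active: Dict[float, Tuple[List[int], List[int]]] = {
    a: (active_indices(bars_a, a), active_indices(bars_b, a)) for a in values
}
```

In this toy input, moving from the value 1.0 to 2.0 removes one column (the first bar of `bars_a` dies) and one row (the second bar of `bars_b` dies), which is the kind of update applied when passing from one reduced matrix to the next.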

Algorithm 1 image kernel
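    Compute the values $a_0 < a_1 < \cdots < a_n$; set $T = \mathrm{Id}_{|\mathcal{A}|}$ and pivots $= \emptyset$
    for $i = 0, \ldots, n$ do
        Update $R_i$ and $K_i$ from $\widetilde{R}_{i-1}$ and $\widetilde{K}_{i-1}$ respectively
        Reduce $K_i$ obtaining $\widetilde{K}_i$. Perform the same reductions to $\mathcal{K}$
        for each column $j$ of $\widetilde{K}_i$ with pivot $p$ such that $\mathcal{A}^{a_i}(p) \notin$ pivots do
            Set $R_i[p]$ to zero and substitute the corresponding column of $\mathcal{I}$ by $f(\widetilde{K}_i[j])$
            Add $\mathcal{A}^{a_i}(p)$ into pivots
        end for
        Reduce $R_i$ into $\widetilde{R}_i$. Perform the same reductions to $T$ and $\mathcal{I}$
        for each zero column $j$ of $\widetilde{R}_i$ such that $\mathcal{A}^{a_i}(j) \notin$ pivots do
            Append $T[\mathcal{A}^{a_i}(j)]$ at the end of $\widetilde{K}_i$, and also at $\mathcal{K}$ with step coefficient $a_i$
            Add $\mathcal{A}^{a_i}(j)$ into pivots
        end for
    end for

Proof. The key observation is that $\mathcal{K}$ forms a barcode basis for $\mathrm{Ker}(f)$ if and only if $\mathcal{K}^r$ is a basis for $\mathrm{Ker}(f)^r$ for all $r \in \mathbb{R}$. Now, notice that $\mathcal{K}^r$ generates $\mathrm{Ker}(f)^r$ since all kernel elements were sent to $\mathcal{K}$. On the other hand, each $\mathcal{K}^r$ is a linearly independent set, since we have performed Gaussian eliminations that ensured this. Similarly, for any $r \in \mathbb{R}$ we have that $\mathcal{I}^r$ generates all the columns from $f(\mathcal{A})^r_{\mathcal{B}}$, and thus it generates $\mathrm{Im}(f)^r$. We have also ensured linear independence of $\mathcal{I}^r$ by the Gaussian elimination process. Thus, $\mathcal{I}$ is a barcode basis for $\mathrm{Im}(f)$.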
Let us compute the complexity of the algorithm. We start by noticing that the factor $n$ comes from the outer loop; the remaining factors come from the Gaussian reductions of $K_i$ and $R_i$ and from the two inner loops performed at each value $a_i$. Notice that the resulting estimate $O(nM|\mathcal{A}|^2)$ can be improved in practice, since most of the values $a_i$ will only indicate a single birth or death on either the image or the kernel. Coming up with an efficient algorithm for this task is an interesting question that goes beyond the scope of this paper.
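As an illustration of the basic operation used at each value $a_i$, the following is a minimal sketch of a left-to-right column reduction that records its column additions in a tracking matrix $T$. It is only a single-parameter sketch under stated assumptions: the field is taken to be $\mathbb{Z}/p$ for a prime $p$, the pivot of a column is its lowest non-zero entry, and the name `reduce_columns` is hypothetical. It does not implement the updates between parameters nor the pivots set of Algorithm 1.

```python
from typing import Dict, List, Tuple

Matrix = List[List[int]]


def reduce_columns(matrix: Matrix, p: int = 5) -> Tuple[Matrix, Matrix]:
    """Left-to-right column reduction over Z/p (p prime).

    Returns (R, T) with R = matrix * T mod p, where T records the column
    additions. Zero columns of R correspond to kernel vectors, which can
    be read off from the same columns of T.
    """
    rows, cols = len(matrix), len(matrix[0])
    R = [[x % p for x in row] for row in matrix]
    T = [[int(i == j) for j in range(cols)] for i in range(cols)]

    def lowest(j: int) -> int:
        # Row index of the lowest non-zero entry of column j, or -1 if zero.
        for i in range(rows - 1, -1, -1):
            if R[i][j]:
                return i
        return -1

    owner: Dict[int, int] = {}  # pivot row -> column that owns this pivot
    for j in range(cols):
        i = lowest(j)
        while i != -1 and i in owner:
            k = owner[i]
            # Coefficient c such that R[i][j] - c * R[i][k] = 0 mod p.
            c = R[i][j] * pow(R[i][k], -1, p) % p
            for r in range(rows):
                R[r][j] = (R[r][j] - c * R[r][k]) % p
            for r in range(cols):
                T[r][j] = (T[r][j] - c * T[r][k]) % p
            i = lowest(j)
        if i != -1:
            owner[i] = j
    return R, T


# Hypothetical 2 x 3 matrix of a linear map over Z/5.
M = [[1, 0, 1],
     [0, 1, 1]]
R, T = reduce_columns(M, p=5)
kernel_vector = [T[r][2] for r in range(3)]  # third column of R is zero
```

Here the third column of `R` reduces to zero, and the third column of `T` gives a kernel vector of `M` modulo 5. The non-zero columns of `R` span the image.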

3.4. Computing Quotients. Now we consider the problem of computing quotients. Suppose that we have inclusions $\mathbb{H} \subseteq \mathbb{G} \subseteq \mathbb{V}$ of finite persistence modules of dimensions $H \le G \le B$ respectively. Furthermore, suppose that $\mathcal{H} = \{h_j\}_{1 \le j \le H}$, $\mathcal{G} = \{g_k\}_{1 \le k \le G}$ and $\mathcal{B} = \{b_i\}_{1 \le i \le B}$ are barcode bases for $\mathbb{H}$, $\mathbb{G}$ and $\mathbb{V}$ respectively. The aim will be to find a barcode basis for $\mathbb{G}/\mathbb{H}$. For each generator $h_j \in \mathcal{H}$, we will use the superscript notation $h_j \sim [a^{h_j}, b^{h_j})$ for the associated interval. Also, $\mathcal{H}$ will be ordered in such a way that $a^{h_i} \le a^{h_j}$ whenever $1 \le i \le j \le H$. The same conventions will be used for the bases $\mathcal{B}$ and $\mathcal{G}$. Then there exists a matrix $M = (m_{i,j}) \in \mathcal{M}_{H \times B}(\mathbb{F})$ expressing the generators of $\mathcal{H}$ in terms of those of $\mathcal{B}$, where the operation is implicit in the equation. We will write this in the more compact form $h = 1_{\mathcal{H}} M b$.
Similarly, there exists a matrix N ∈ M G×B (F) such that g = 1 G Nb.
Consider the inclusions ι H : H → V and ι G : G → V, and define the morphism R : Thus we have that: Hence, in order to compute a basis for the quotient, all that we need to do is apply image kernel to the matrix h T | g T B .The last |G | nontrivial generators from I lead to a basis for G/H.

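3.5. Homology of Persistence Modules. Consider a chain of tame persistence modules:
(5) $0 \to \mathbb{V}_n \xrightarrow{d_n} \mathbb{V}_{n-1} \to \cdots \to \mathbb{V}_0 \xrightarrow{d_0} 0$,
where each term has basis $\mathcal{B}_j$ for $0 \le j \le n$. Then, applying image kernel, we obtain bases $\mathcal{I}_{j-1}$ and $\mathcal{K}_j$ for the image and kernel of $d_j$ for all $0 \le j \le n$. Proceeding as in the previous section, we consider the matrices $(R_j(\mathcal{I}_j \mid \mathcal{K}_j))_{\mathcal{B}_j}$ and apply image kernel again. This leads to bases $\mathcal{Q}_j$ for the homology for all $0 \le j \le n$.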

A REVIEW ON THE MAYER-VIETORIS SPECTRAL SEQUENCE
In this section, we give an introduction to the Mayer-Vietoris spectral sequence. This section makes no claims of originality; the ideas come mainly from [4,23]. We include it because we think it beneficial to outline a minimal, self-contained explanation of the procedure. Also, we will use it as necessary background for Section 5. For simplicity we will focus on ordinary homology over a field $\mathbb{F}$.
Later on we will extend these ideas to the case of persistent homology over a field.
Let $K$ be a simplicial complex, and $\mathcal{U} = \{U_i\}_{0 \le i \le m}$ be a cover of $K$ by subcomplexes. Suppose that we want to compute the homology of $K$ from the cover elements. Then a naive approach to solving the problem would be to compute the homology groups $H_n(U_i)$, and proceed by adding all of them back together:
(6) $H_n(K) \cong \bigoplus_{i=0}^{m} H_n(U_i)$.
Unfortunately, this is hardly ever true and we will need to find other ways of dealing with this merging of information. To introduce the distributed problem, we forget about simplicial complexes, and go back to the domain of topological spaces and open covers.
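4.1. The Mayer-Vietoris theorem. Consider the torus $T^2$ covered by two cylinders $U$ and $V$, as illustrated in Figure 6. Then one sees that equality (6) does not hold in dimensions 0 and 2:
$$H_0(U) \oplus H_0(V) \cong \mathbb{F}^2 \ncong \mathbb{F} \cong H_0(T^2), \qquad H_2(U) \oplus H_2(V) = 0 \ncong \mathbb{F} \cong H_2(T^2).$$
In order to amend this, one has to look at the information given by the intersection $U \cap V$. This information comes as identifications and new loops. For example, $U$ and $V$ are connected through the intersection. Also, the loop going around each cylinder $U$ and $V$ is identified in the intersection. These identifications are performed by taking the quotient
$$I_n := \mathrm{coker}\big(H_n(U \cap V) \to H_n(U) \oplus H_n(V)\big)$$
for all $n \ge 0$, where the previous morphism is induced by the Čech differential $\delta^n_1 : S_n(U \cap V) \to S_n(U) \oplus S_n(V)$. Additionally, the 1-loops in the intersection merge to the same loop when included in each cylinder $U$ or $V$. This situation creates a 2-loop or 'void', see Figure 6. Thus we have the $n$-loops detected by the kernel
$$L_n := \mathrm{Ker}\big(H_{n-1}(U \cap V) \to H_{n-1}(U) \oplus H_{n-1}(V)\big)$$
for all $n \ge 0$. Notice that $n$-loops are found by $n - 1$ information on the intersection. Putting all together, we have that
$$I_0 \cong \mathbb{F}, \quad I_1 \cong \mathbb{F}, \quad I_2 = 0, \qquad L_0 = 0, \quad L_1 \cong \mathbb{F}, \quad L_2 \cong \mathbb{F}.$$
This leads to the expected result
$$H_0(T^2) \cong \mathbb{F}, \qquad H_1(T^2) \cong \mathbb{F} \oplus \mathbb{F}, \qquad H_2(T^2) \cong \mathbb{F}.$$
On a more theoretical level, what we have presented here is commonly known as the Mayer-Vietoris theorem. We can think of each homology group $H_n(U \cup V)$ as a filtered object,
$$0 \subseteq F^0 H_n(U \cup V) \subseteq F^1 H_n(U \cup V) = H_n(U \cup V).$$
Then, the Mayer-Vietoris theorem gives us the expressions for the different ratios between consecutive filtrations,
$$F^0 H_n(U \cup V) = I_n, \qquad F^1 H_n(U \cup V) / F^0 H_n(U \cup V) \cong L_n.$$
In particular, since we are working with vector spaces we obtain
$$H_n(U \cup V) \cong I_n \oplus L_n.$$

Figure 6. Torus covered by a pair of cylinders $U$ and $V$.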
The above discussion gives rise to the total chain complex, ) for all n ≥ 0. Notice that the first two morphisms do not change components, whereas the third encodes the 'merging' of information.This last morphism is represented by red arrows on the diagram: where the rectangle of red arrows is commutative.In particular, this implies that d Tot n •d Tot n+1 = 0 for all n ≥ 0. Computing the homology with respect to the total differentials and using the previous characterization of I n and L n , one obtains This result will be further generalized in proposition 5.
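The torus computation above can be checked numerically with a short sketch. This is an illustration only: coefficients are taken in the real numbers so that NumPy ranks can be used, the inclusion matrices are written by hand (each of the two components of $U \cap V$ is sent to the generator of $U$ and, with opposite sign, to the generator of $V$), and the helper names `rank` and `glued_dims` are hypothetical.

```python
import numpy as np


def rank(m: np.ndarray) -> int:
    """Rank of a matrix, treating empty matrices as rank 0."""
    return 0 if m.size == 0 else int(np.linalg.matrix_rank(m))


def glued_dims(maps: list) -> list:
    """dim H_n(U u V) = dim coker(iota_n) + dim ker(iota_{n-1}).

    maps[n] is the matrix of iota_n : H_n(U n V) -> H_n(U) + H_n(V),
    with shape (dim of target, dim of source).
    """
    dims = []
    for n, m in enumerate(maps):
        coker = m.shape[0] - rank(m)
        ker_prev = 0 if n == 0 else maps[n - 1].shape[1] - rank(maps[n - 1])
        dims.append(coker + ker_prev)
    return dims


# U and V are cylinders and U n V is a pair of cylinders, so in degrees
# 0 and 1 the source and the target are both two-dimensional.
incl = np.array([[1.0, 1.0],
                 [-1.0, -1.0]])
no_h2 = np.zeros((0, 0))  # no degree-2 homology in U, V or U n V

torus = glued_dims([incl, incl, no_h2])
```

The resulting list `torus` contains the dimensions of $H_0$, $H_1$ and $H_2$ of the torus, obtained as cokernel plus kernel contributions.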

4.2. The Mayer-Vietoris spectral sequence. After this digression, we move back to a simplicial complex $K$ with a covering $\mathcal{U} = \{U_i\}_{i=0}^{m}$ by subcomplexes. In this case, we need to take into account all the intersections between different subcomplexes. We can extend the intuition from the previous subsection by recalling the definition of the $(n, \mathcal{U})$-Čech chain complex given in the preliminaries. Stacking all these sequences on top of each other, and also multiplying differentials in odd rows by $-1$, we obtain a diagram which leads to a double complex $(S_{*,*}, \bar{\delta}, d)$ defined as
$$S_{p,q} := \bigoplus_{\sigma \in N^{\mathcal{U}}_p} S_q(U_\sigma)$$
for all $p, q \ge 0$, and also $S_{p,q} := 0$ otherwise. We denote $\bar{\delta} = (-1)^q \delta$, the Čech differential multiplied by $-1$ on odd rows. The reason for this change of sign is that we want $S_{*,*}$ to be a double complex, in the sense that the following equalities hold:
(7) $d \circ d = 0, \qquad \bar{\delta} \circ \bar{\delta} = 0, \qquad d \circ \bar{\delta} + \bar{\delta} \circ d = 0.$
Since $S_{*,*}$ is a double complex, we can study the associated chain complex $S^{Tot}_*$, commonly known as the total complex. This is formed by taking the sums of anti-diagonals
$$S^{Tot}_n := \bigoplus_{p+q=n} S_{p,q}$$
for each $n \ge 0$. The differentials on the total complex are defined by $d^{Tot} = d + \bar{\delta}$, which satisfy $d^{Tot} \circ d^{Tot} = 0$ from equations (7); see Figure 7 for a depiction of this. Later, in proposition 5, we will prove that $H_n(K) \cong H_n(S^{Tot}_*)$ for all $n \ge 0$. The problem still remains difficult, since computing $H_n(S^{Tot}_*)$ directly might be even harder than computing $H_n(K)$. The key is that there is a divide and conquer method which allows us to break apart the calculation of $H_n(S^{Tot}_*)$ into small, computable steps.
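To make the total complex concrete, here is a minimal sketch on a hypothetical toy example that does not appear in the paper: a path with vertices $v_0, v_1, v_2$ and edges $v_0v_1$, $v_1v_2$, covered by $U_0 = \{v_0, v_1, v_0v_1\}$ and $U_1 = \{v_1, v_2, v_1v_2\}$, so that $U_0 \cap U_1 = \{v_1\}$. Coefficients are real numbers and the sign convention chosen for the Čech differential is an assumption. The sketch computes the homology of the total complex and recovers that of a path, namely one connected component and no loops.

```python
import numpy as np

# Basis of S_{0,0}: (v0 in U0, v1 in U0, v1 in U1, v2 in U1).
# Basis of S_{0,1}: (edge v0v1 in U0, edge v1v2 in U1).
# Basis of S_{1,0}: (v1 in the intersection of U0 and U1).

# Vertical differential d : S_{0,1} -> S_{0,0}, edge -> end minus start.
d = np.array([[-1.0, 0.0],
              [1.0, 0.0],
              [0.0, -1.0],
              [0.0, 1.0]])

# Cech differential S_{1,0} -> S_{0,0}: copy in U1 minus copy in U0.
# Row q = 0 is even, so no extra sign is applied here.
delta = np.array([[0.0],
                  [-1.0],
                  [1.0],
                  [0.0]])

# Total differential on S^Tot_1 = S_{0,1} + S_{1,0}, with values in S^Tot_0.
d_tot_1 = np.hstack([d, delta])
rank_1 = int(np.linalg.matrix_rank(d_tot_1))

# S^Tot_2 = 0 in this example, so nothing is quotiented out in degree 1.
h0 = d_tot_1.shape[0] - rank_1  # dim coker(d^Tot_1)
h1 = d_tot_1.shape[1] - rank_1  # dim ker(d^Tot_1)
```

Although $S^{Tot}_0$ has dimension four because $v_1$ is counted once per cover element, the Čech column glues the two copies, and the total complex has the homology of the original path.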
On the left, in cyan the four direct summands of Ker(d Tot ) 4 .The corresponding GK r,3−r are framed to emphasize that they are respective subspaces of S r,3−r for all 0 ≤ r ≤ 3. On the right, in orange the subspaces GZ r 2,1 , eventually shrinking to GK 2,1 .For convenience, we have labelled

Let us start by computing the kernel $\mathrm{Ker}(d^{Tot}_n)$, which is depicted in Figure 7. Recall that we will be working with vector spaces and linear maps throughout. Let $s = (s_{k,n-k})_{0 \le k \le n} \in S^{Tot}_n$ be in $\mathrm{Ker}(d^{Tot}_n)$. Then $s$ will satisfy the equations $d(s_{k,n-k}) = -\bar{\delta}(s_{k+1,n-k-1})$ for all $0 \le k < n$. Thus, one can obtain kernel elements by considering subspaces $GK_{p,q} \subseteq S_{p,q}$. The subspace $GK_{p,q}$ is composed of elements $s_{p,q} \in S_{p,q}$ such that $d(s_{p,q}) = 0$, and there exists a sequence $s_{p-r,q+r} \in S_{p-r,q+r}$ satisfying the equations $d(s_{p-r,q+r}) = -\bar{\delta}(s_{p-r+1,q+r-1})$ for all $0 < r \le p$. Notice that $GK_{p,q}$ is a subspace of $S_{p,q}$ since both $d$ and $\bar{\delta}$ are linear. We will see that one has (non-canonical) isomorphisms,
(8) $\mathrm{Ker}(d^{Tot}_n) \cong \bigoplus_{p+q=n} GK_{p,q}.$
This is depicted in Figure 8. It turns out that this is true only when we are working with vector spaces. Later, we will work with a more general case where such isomorphisms do not hold. This will be known as the extension problem.
Hence, recovering the sets GK p,q leads to the kernel of d Tot n .The problem with this approach is that each subspace GK p,q still requires a large set of equations to be checked.A step-by-step way of computing these is by adding one equation at a time.For this we define the subspaces GZ r p,q ⊆ S p,q where we add the first r equations progressively.That is, we start setting GZ 0 p,q = S p,q .Then we define GZ 1 p,q to be elements s p,q ∈ S p,q such that d(s p,q ) = 0, or equivalently GZ 1  p,q = Ker(d) p,q .In an inductive way, for r ≥ 2 we define GZ r p,q to be formed by elements s p,q ∈ GZ r−1 p,q such that there exists a sequence s p−k,q+k ∈ S p−k,q+k satisfying equations d(s p−k,q+k ) = − δ (s p−k+1,q+k−1 ) for all 1 ≤ k < r.Then, for all p, q ≥ 0, we have a decreasing sequence GK p,q = GZ p+1 p,q ⊆ GZ p p,q ⊆ • • • ⊆ GZ 0 p,q = S p,q .
For intuition see Figure 8, and also Figure 10 for a depiction of GZ 2 3,1 on a lattice.A very compact way of expressing that is by the definition GZ r p,q = Ker(d) ∩ ( δ −1 • d) r−1 (S p−r+1,q+r−1 ) for all r ≥ 1, where by On the left, the different subspaces on S 2,1 .
Here IB r 2,1 = Im B r 2,1 → GZ r+1 2,1 , for all 0 ≤ r ≤ 2. The framed region represents S 2,1 .Brighter colours represent bigger regions than darker colours.Note that blue and orange colours have been assigned to GZ * 2,1 and IB * 2,1 respectively.On the right, the morphism d 2 : on the second page.The two framed regions represent the codomain and domain of d 2 , these have been assigned brighter and darker colours, respectively.then the second page will be .
The second page has differential d 2 induced by the total complex differential d Tot .Figure 11 illustrates this principle.
Doing the same for all pages we obtain the definition of the r-page: for all r ≥ 2. Of course, we can express alternatively the r-page terms as: Thus, the ∞-page is: Then, for n = p + q one has the equality Therefore, computing the spectral sequence is a way of approximating the associated module G p V H n (S Tot * ).Thus adding up all of these leads to the result H n (S Tot * ).By convention, since we say that E * p,q converges to H n (S Tot ) and we denote this as Here we have adopted the definition of Z r p,q and B r p,q that one can find in [23].Other sources such as [4] and [22] use the same notation for other terms.
So far, we have studied spectral sequences for vertical filtrations.Similarly, there is a horizontal filtration, F r H S Tot n := p+q=n q≤r S p,q , for all r ≥ 0. We can apply the same argument to this filtration, to obtain a spectral sequence ).An intuitive way of thinking of this is by applying a symmetry about the diagonal x = y on the previous discussion.Thus the first page is computed with the homology with respect to horizontal differentials, the second with respect to vertical differentials, and so on.This leads easily to the following widely known result: Proof.In order to turn to the first page, we need to compute homology with respect to the horizontal differentials δ .As shown in the preliminaries, the Čech chain complexes are exact, so that: S q (K) if p = 0 and q ≥ 0 0 otherwise.
After this one can compute the second page by the homology with respect to vertical differentials d induced on the first page, H q (K) if p = 0 and q ≥ 0 0 otherwise.To proceed to the next page, we would need to consider homology with respect to diagonal differentials, p,q has only one non-zero column p = 0, computing homology with respect to d 2 leaves this page intact.The same happens when we consider for any r > 2 homology with respect to differentials d r : H E r p,q −→ H E r p+r−1,q−r .Thus, we say that H E * p,q has collapsed on the second page, which is usually denoted as H E 2 p,q = H E ∞ p,q .Each diagonal has a unique nonzero entry H E ∞ 0,q ∼ = H q (K).In particular, we have isomorphisms Therefore, using proposition 5, we have that the spectral sequence converges to the wanted result In particular, since we are in the category of vector spaces, there are no extension problems.Thus, we have an isomorphism Throughout the following section, we will adapt this setting to the category of persistence modules.

PERSISTENT MAYER-VIETORIS
One can translate the method from section 4 to PMod.The reason for this is that PMod is an abelian category, since Vect is an abelian category and R is a small category.The theory of spectral sequences can be developed for arbitrary abelian categories.For an introduction to this, see chapter 5 in [30].
Suppose that we have covered a filtered simplicial complex $K$ with filtered subcomplexes $\mathcal{U} = \{U_i\}_{i \in I}$, so that $K = \bigcup_{i \in I} U_i$. Then, we can compute the spectral sequence
$$E^*_{p,q} \Rightarrow \mathrm{PH}_n(K),$$
where $p + q = n$. However, unlike the case of vector spaces, we might have that
$$\bigoplus_{p+q=n} E^\infty_{p,q} \ncong \mathrm{PH}_n(K).$$

Figure 12. As the radius increases, more edges are added. At radius $r = 0.5$ a circle will be across the two covers $U$ and $V$. Later on, at radius $r = 0.6$ this circle will be split into two.

Figure 13. Barcode on associated module.
All that we know is that E ∞ p,q ∼ = G p PH p+q (K) for all p, q ≥ 0. This is the extension problem, which we will solve in Section 5.1.After solving this problem we will obtain the persistent homology for K.We will even recover more information.Notice that as pointed out in [31], the knowledge of which subset J ⊂ I detects a feature from PH n (K) can potentially add insight into the information given by ordinary persistent homology.The following example illustrates this.Example 6.Consider the case of a point cloud X covered by two open sets as in Figure 12.From Sections 3 and 4, we know how to compute the ∞-page (E ∞ * , * ) r associated to any value r ∈ R. In particular, when we take r = 0.5, then the combination of U and V detects a 1-cycle.On the other hand, when r = 0.6 this cycle splits into two smaller cycles which are detected by U and V individually.Notice that if we want to come up with a persistent Mayer-Vietoris method then we need to be able to track this behaviour.That is, we need to know how cycles develop as r increases.In particular, the barcode I(0.5, 1) from PH 1 (X) will be broken down into some smaller barcodes, see diagram 13.These will be E ∞ 1,0 ∼ = I(0.5, 0.6) and also E ∞ 0,1 ∼ = I(0.6,1.0) ⊕ I(0.6, 1.0).The way we will solve this problem is by using the barcode basis machinery developed in Section 3.
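The behaviour described in Example 6 can be read off from the bars of the $\infty$-page by counting, at each radius, how many bars are alive in each position. The following sketch assumes half-open bars $[a, b)$ and uses a hypothetical helper `dim_at`; the intervals are the ones quoted in the example.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]  # half-open bar [birth, death)


def dim_at(bars: List[Interval], r: float) -> int:
    """Number of bars alive at the value r, i.e. with birth <= r < death."""
    return sum(1 for birth, death in bars if birth <= r < death)


# Bars of the infinity page in total degree 1, as stated in Example 6.
e_inf_10 = [(0.5, 0.6)]              # detected by U and V jointly
e_inf_01 = [(0.6, 1.0), (0.6, 1.0)]  # detected by U and V individually

dims: Dict[float, Tuple[int, int]] = {
    r: (dim_at(e_inf_10, r), dim_at(e_inf_01, r)) for r in (0.5, 0.6)
}
```

At $r = 0.5$ the only class lives in position $(1, 0)$, whereas at $r = 0.6$ it has been replaced by two classes in position $(0, 1)$. Tracking how the former relates to the latter is precisely what the extension problem addresses.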

5.1. The Extension Problem.
Recall the definition of the total complex, vertical filtrations and associated modules from Section 4. Throughout this section we study the extension problem; that is, we will recover $H_n(S^{Tot}_*)$ from the associated modules $G^p_V H_n(S^{Tot}_*)$. Also, we will assume that the spectral sequence collapses after a finite number of pages. Consider the persistence module $\mathbb{V} := H_n(S^{Tot}_*)$ together with its vertical filtration $F^k\mathbb{V}$. We define the associated modules as the quotients $\mathbb{G}^k = F^k\mathbb{V} / F^{k-1}\mathbb{V}$ for all $0 \le k \le n$. This gives rise to short exact sequences,
(11) $0 \to F^{k-1}\mathbb{V} \to F^k\mathbb{V} \to \mathbb{G}^k \to 0.$
Adding up all associated modules we obtain a persistence module $\mathbb{G} := \bigoplus_{i=0}^{n} \mathbb{G}^i$ with an additional filtration given by $F^k\mathbb{G} := \bigoplus_{i=0}^{k} \mathbb{G}^i$. The spectral sequence algorithm will lead to a barcode basis for $\mathbb{G}$. The extension problem consists in computing a basis $\mathcal{B}$ for $\mathbb{V}$ from a basis $\mathcal{G}$ of $\mathbb{G}$.

Figure. A one loop is detected at value $r \sim 0.208$ which goes through three covers. Later, at radius $r = 0.5$, this loop splits into three loops, each included in one of the three covers.
To start, notice that for each r ∈ R the sequence (11) splits, leading to morphisms Notice that if we already computed G from the Mayer-Vietoris spectral sequence, then there is no need to do any extra computations to obtain these morphisms F k (r).All we need to do is to store our previous results.Adding over all 0 ≤ k ≤ n we obtain the isomorphism F (r) = n k=0 F k (r) : n k=0 G k (r) → V(r).This last morphism is an isomorphism since all its summands are injective, their images have mutual trivial intersection, and the dimensions of the domain and codomain coincide.
Recall that G has induced morphisms G(r ≤ s) from V(r ≤ s) for all values r ≤ s in R. Given a basis G for G, we would like to compute a basis B for V from this information.Notice that this is not a straightforward problem since (12) does not imply that one has an isomorphism F : G → V.A point to start is to define the image along each generator in G .That is, for each barcode generator g i ∼ [a i , b i ) in G , we choose an image at the start F (a i )(g i (a i )).After, we set F (r)(g i (r)) := V(a i < r) • F (a i )(g i (a i )) for all a i < r < b i .This leads to commutativity of F along each generator g i .Nevertheless this is still far from even defining a morphism F : G → V.
The solution to the problem above is to define a new persistence module G.We define G(s) := G(s) for all s ∈ R.Then, if G = {g i } 1≤i≤G is a barcode basis for G, we will have that G (s) will be a basis of G(s) for all s ∈ R. Now, given g i ∼ [a i , b i ) a generator in G , we define the morphism G(r ≤ s) by the recursive formula where c i, j ∈ F for all 1 ≤ i, j ≤ G.We want to define c i, j in such a way that G is isomorphic to V. For this we impose the commutativity condition which leads to the equation This determines uniquely the coefficients c i, j for all 1 ≤ i, j ≤ G. Notice that G respects the filtration on V, since the right hand side in ( 13) is a composition of filtration preserving morphisms.In particular, if Fix a generator g i ∈ G k with associated interval [a i , b i ).Let us calculate the coefficients c i, j .Suppose that we have a representative g j = (β j 0 , β j 1 , . . ., β j k , 0, . . ., 0) ∈ S Tot n for each generator g j ∈ G , with g j = [β j k ] ∞ k,n−k .Also, for all 0 ≤ q ≤ n we define the subset I q ⊆ {1, . . 
., G} of indices 1 ≤ j ≤ G such that g j ∈ G q .Then the coefficients c i, j for j ∈ I k \ {i} are determined by the equality in G k (b i ) Thus, we have n denotes the n-homology class of the total complex.Hence, by (11) there must exist some How do we compute γ?We start by searching for the first page r ≥ 2 such that (15) where [•] r k,n−k denotes the class in the r-page in position (k, n − k).Notice that this r must exist since we assumed that (15) vanishes on the ∞-page.In fact, there exists Repeating for all pages leads to γ k+t ∈ E t k+t,n−k−t+1 (b i ) for all 0 ≤ t ≤ r − 1, such that ( 16) Notice that equation ( 16) holds independently of the representatives, since if we changed some term, then the other representatives would adjust to the change.In particular, we have that the k component of ( 14) vanishes, whereas the k − 1 component will be equal to Next we proceed to find coefficients c i, j ∈ F so that in G k−1 (b i ) we get the equality Then we proceed as we did on G k .Doing this for all parameters 0 ≤ r ≤ k, there are coefficients c i, j ∈ F, and an element γ ∈ S Tot n (b i ) so that Thus, Proof.Since each F (s) is an isomorphism, and also we have commutative squares: G(s) for all r ≤ s, then F must be an isomorphism of persistence modules.This gives G ∼ = V, but we still need to compute a barcode basis.In fact, this can be done by applying the algorithm image kernel, but with barcode updates given by the morphisms of G.The set I which results from this procedure will be a barcode basis for G, which by proposition 7 leads to a barcode basis for V.

5.2. PERMAVISS.
Here we outline a procedure for implementing the persistence Mayer-Vietoris spectral sequence.Notice that while using the submodules GZ r p,q and IB r p,q is a more intuitive approach from a mathematical perspective, it is more efficient to work directly with the sets Z r p,q and B r p,q .By storing representatives in Z r p,q , we avoid repeating computations on each page and in the extension problem.Furthermore, this approach allows to easily track the complexity of the algorithm.The current implementation of PERMAVISS (v.0.0.2) uses the sets GZ r p,q and IB r p,q .However, future versions will implement the method described here, since it is more efficient and parallelizable.0-Page.We start by defining the 0-page as the quotient for all pair of integers p, q ≥ 0. The 0 differential d 0 , is isomorphic to the standard chain differential d 0 p,q ∼ = d q : S p,q → S p,q−1 .
In particular, for each simplex $\sigma \in N^{\mathcal{U}}_p$, the morphism $d^0_{p,q}$ restricts to a local differential $d^\sigma_q : S_q(U_\sigma) \to S_{q-1}(U_\sigma)$.
Thus, we can compute persistent homology to obtain a local basis for the image $\mathrm{Im}(d^\sigma_{q+1})$ and the homology $E^1_{\sigma,q}$. Putting all of these together, we get a basis for $E^1_{p,q}$ as the union $\mathcal{E}^1_{p,q} = \bigcup_{\sigma \in N^{\mathcal{U}}_p} \mathcal{E}^1_{\sigma,q}$. Further, for each generator $\alpha \in E^1_{p,q} \subseteq E^0_{p,q}$, we store a chain $\alpha_p \in S_{p,q}$ so that $\alpha = [(0, \ldots, 0, \alpha_p, 0, \ldots, 0)]^0$, where we denote by $[\cdot]^r$ a class in $E^r_{p,q}$ for all $r \ge 0$.
1-Page.Recall that the first page elements are given as classes in the quotient Therefore, for each generator α ∈ E 1 p,q ⊆ E 0 p,q , with α ∼ [a α , b α ), there is a chain α p ∈ S p,q , so that α = [(0, . . ., 0, α p , 0, . . ., 0)] 0 .Then we compute the image of d 1 on [α] 1 d 1 [α] 1 = d Tot (0, . . ., 0, α p , 0, . . ., 0) 1 = 0, . . ., 0, δp (α p ), 0, . . ., 0 1 .Now, for each simplex τ ∈ N U p−1 , we have local coordinates δp (α p ) τ ∈ S q (U τ ).We proceed to solve the linear equation at a a α X = δp (α p ) τ , where the vector X has as many entries as needed for the equation to make sense.Also, we have used τ,q (a α ) for denoting the matrix on value a α , and whose rows correspond to a basis of S q (U τ ).The solution X leads to coefficients c 1 β ∈ F for all β ∈ E 1 τ,q and an element a τ ∈ S q+1 (U τ ) so that Repeating this for all τ ∈ N U p−1 , we get coefficients c 1 β ∈ F for all β ∈ E 1 p−1,q as well as a chain Here we define the representative α = (0, . . ., 0, a p−1 , α p , 0, . . ., 0) ∈ S Tot p+q , and repeating this for all generators in E 1 p,q , we get a set of corresponding representatives E 1 p,q .On the other hand, the computed coefficients c 1 β mean that d 1 p,q performs the assignment Thus, we obtain an associated matrix D 1 p,q for d 1 p,q .Using image kernel, we compute bases for the kernel and image.Additionally, for each generator j ∈ Im(d 1 p,q ), we store a preimage p j ∈ E 1 p+1,q such that d 1 (p j ) = j.This can be done by storing coefficients c 1 γ for all γ ∈ E 1 p+1,q so that p j = ∑ γ∈E 1 p+1,q c 1 γ γ.Notice that these coefficients are given by image kernel by asking to return the matrix T .This leads to the second page by applying image kernel to compute the quotient Ker(d 1 )/Im(d 1 ), obtaining bases E 2 p,q .
2-Page.Now, we proceed to compute the third page.We start from α ∈ E 2 p,q ⊆ E 1 p,q , with α ∼ [a α , b α ) and coordinates α = (b β ) β ∈E 1 p,q .Then, this leads to a total complex representative α = (0, . . ., 0, Since α ∈ Ker(d 1 ), we have that As before, by solving local linear equations, we can compute coefficients c Now, we solve the linear equation on X and value a α ∈ R .
Repeating this for all $\alpha \in E^2_{p,q}$, we obtain a matrix $D^2_{p,q}$ associated to $d^2_{p,q}$. Applying image kernel, we obtain bases for the kernel and the image, together with preimages. Applying image kernel one more time to compute the quotient, we obtain generators for the third page $E^3_{p,q}$.
k-Page.Suppose that we have computed generators E k p,q ⊆ E k−1 p,q , together with total complex representatives E k−1 p,q for k ≥ 3. Let a generator α ∈ E k p,q with α ∼ [a α , b α ) and coordinates (b β ) β ∈E k−1 p,q .Then, we define a representative α = (0, . . ., 0, α p−k+1 , . . ., α p , 0, . . ., 0) = ∑ p,q and as a consequence On the other hand, for p − k + 1 > 0, we 'lift' d Tot ( α) to the k-page.We start on the 0-page, where we can repeat the procedure outlined on the 1-page, to obtain coefficients , and an element a ∈ S p−k,q+k so that δp−k+1 Next, for each k ≥ r ≥ 2, we solve the linear equation on X and on value a α ∈ R Eventually, we obtain the coefficients . This leads to the associated matrices, and then we can compute image kernel, etc.On the other hand, we redefine the representative of α as α = (0, . . ., 0, a, α p−k+1 , . . ., α p , 0, . . ., 0) This leads to the set of representatives E k p,q ⊆ Z k p,q .
5.3.Extension Problem.After computing all pages of the spectral sequence, we still have to solve the extension problem.It turns out that the procedure is almost exactly the same as for when computing a page on the spectral sequence.We start from a basis E ∞ p,q , with total complex representatives E ∞ p,q .Since we assume that the spectral sequence is bounded, it collapses at an L > 0 page.Then, for each generator α ∈ E L p,q , with α ∼ [a α , b α ), we have a corresponding representative α = (α 0 , . . ., α p , 0, . . ., 0) ∈ S Tot p+q in E L p,q .The main procedure consists in lifting α p to the L-page.We do this by means of local linear equations as done on the 1-page.However, this time, instead of using the value a α we use b α .This leads to a ∈ S p,q+1 and coefficients (c 1 β ) β ∈E 1 p,q so that The same happens for all the pages 1 ≤ r ≤ L, where all the linear equations are using the value b α .This leads to coefficients (c r γ ) γ∈E r p+r,q−r+1 for all 1 ≤ r ≤ L − 1, and also (c L β ) β ∈E L p,q .Then, we define In particular, notice that [ α p−1 ] L = 0.In fact, for all integers L − 1 ≥ r ≥ 0 one has that [ α p−1 ] r = 0, since both the adding and substracting terms are a sum of elements in E r p,q with the same coefficients.As a consequence the p-component of α p−1 vanishes, so α p−1 ∈ F p−1 S Tot p+q .Then, one can repeat this process with α r for all p − 1 ≥ r ≥ 0. 
This leads to all coefficients (c L β ) β ∈E L p−r,q+r for all 0 ≤ r ≤ p, which solves the extension problem.That is, we have an assignment and a matrix associated to the extensions.Then, applying image kernel, we obtain a barcode basis for persistent homology.This is more efficient than the solution presented in section 5.1, however, the former is more intuitive.Notice that X ≥ Y .On the other hand, we define Let n be the number of values in R where some bar changes in the first page generators E 1 p,q .Notice that one has n ≤ 4H.Assume P is the number of processors.0-page.When computing the first page, all we need to do is calculate persistent homology in parallel.Then, the complexity is This leads to generators for the first page.
1-page.For the first page, recall that we start from a generator α ∈ E 1 p,q with α ∼ [a α , b α ) and proceed to solve |N U p−1 | linear equations.Notice that this can be done for all generators from E 1 p,q simultaneously.This is because as the value a α changes, only columns are added and removed to the local linear equations, leaving the rows intact.On the other hand, we need to execute image kernel on at most dim(N U )D s elements on the first page.Notice that for each of these, we first compute a basis for the images and kernels, and afterwards we perform the quotients.Each of these takes a complexity of at most O(4H 4 ).Also, we need to add the complexity of the Čech differential.An option for computing this, is to compare simplices in different covers by their vertices; two simplices are the same iff they share the same vertex set.This would take less than O(|N U |D s X k-page.Now, we proceed for the complexity of the page k ≥ 2. This is the same as for the 1 page, with the addition of Gaussian eliminations of higher pages.These take at most O(H 2 ) time for each generator in E r p,q .If we do these for all generators simultaneously, since we need to update both rows and columns in a matrix, we might use image kernel and the complexity becomes O(nH 3 ).Denoting by L the infinity page, we have the new term dim(N U )D s P O(4LH Extension problem.If the spectral sequence collapses at L > 0, then the complexity of extending all generators in E L p,q is bounded by that of computing the L page about D s times.
Overall complexity. Altogether, we have a complexity bounded by that of computing the first page plus that of computing the $L$ page $L + D_s$ times. Here the $L$ comes from computing the $L$ page $L$ times and $D_s$ from the extension problem. Thus, the overall complexity is bounded by the sum of these contributions. Notice that in general $D_s$, $L$ and $\dim(N^{\mathcal{U}})$ are much smaller than $H$ and $X$. Thus, for covers such that $Y \ll X$ and $|N^{\mathcal{U}}| \ll X$, and assuming we have enough processors, the complexity can be simplified to the two dominating terms $O(X^3) + O(H^4)$.
Notice that this last case is satisfied for those covers whose mutual intersections are generally smaller than each cover. Also, in this case $H$ is approximately of the order of the number of nontrivial bars over the whole input complex. This shows that PERMAVISS isolates simplicial data, while only merging homological information. It is worth noticing that in general $H$, being the number of nontrivial bars, is much smaller than the size of the whole simplicial complex. However, in some cases this might not be true. Nevertheless, our complexity estimates are very generous, leaving plenty of room for improvement in concrete applications.

CONCLUSION
We started by developing linear algebra for persistence modules. In doing so, we introduced bases of persistence modules, as well as matrices associated to morphisms. Also, we presented Algorithm 1, which computes bases for the image and the kernel of a persistence morphism between any pair of tame persistence modules. Then a generalization of traditional persistent homology was introduced in Subsection 3.5. This theory has helped us to define and understand the Persistent Mayer-Vietoris spectral sequence. Furthermore, we have provided specific guidelines for a distributed algorithm, with a solution to the extension problem presented in Section 5.1. The PERMAVISS method presented in Section 5.2 isolates simplicial information to local matrices, while merging only homological information between different covers. Thus, the complexity of this method is dominated by the size of a local complex plus the order of barcodes over all the data. A first implementation of these results can be found in [29]. Coding an efficient implementation from the pseudo-code given in this paper, and benchmarking its performance against other methods, will be a matter of future research. Another interesting direction of research is how to merge this method with existing algorithms, such as those from [8,9,14,24]. In particular, it would be interesting to explore the possible interactions between discrete Morse theory and this approach, see [12]. Additionally, it will be worth exploring, both theoretically and practically, which covers are most suitable for different applications. Finally, we would also like to study the additional information given by the covering, which adds locality information to persistent homology. In this regard, it is worth noticing that in experiments the two most expensive pages to compute are the first and the second. This is why we strongly believe that most of the extra information will be contained in the first two pages.
Since j is an injection and κ ∼ [a, b) is a basis generator, the image j(κ) needs to be non-zero along the interval [a, b).Thus if α∈A (κ) (k α,κ α) is associated to the interval [A, B), then by injectivity b ≤ B. On the other hand, since the image j(κ) is associated to [A, B), then B ≤ b by naturality of j, whence we obtain the equality B = b.Since A (κ) is finite, there must exist some α ∈ A (κ) with death value b α = b.
K i might take at most O(|A | 3 ) time.On the other hand the reduction of R i might take O(|B||A | 2 ) time.The first inner loop will take less than O(|A |(log(|A |) + |A ||B|)) time, where the multiplying |A | comes from the iteration.Within round brackets, the first term comes from checking pivots by a hash table or similar, whereas the second comes from computing f (K i [ j]).The second inner loop takes O(|A |log(|A |)) time, where |A | is for the iteration and log(|A |) for checking pivots.Putting all together we obtain the following complexity:

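5.4. Complexity Analysis. Let $D_s$ be the maximum simplex dimension in $K$, and $\dim(N^{\mathcal{U}})$ the dimension of the nerve. Let $L$ be the number of pages. Denote $N^{\mathcal{U}}_{\ge 1} = \bigcup_{k \ge 1} N^{\mathcal{U}}_k$. Let $X = \max_{q \ge 0,\ \sigma \in N^{\mathcal{U}}} |S_q(U_\sigma)|$ and let $Y = \max_{q \ge 0,\ \sigma \in N^{\mathcal{U}}_{\ge 1}} |S_q(U_\sigma)|$.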