A Framework for Differential Calculus on Persistence Barcodes

We define notions of differentiability for maps from and to the space of persistence barcodes. Inspired by the theory of diffeological spaces, the proposed framework uses lifts to the space of ordered barcodes, from which derivatives can be computed. The two derived notions of differentiability (respectively from and to the space of barcodes) combine together naturally to produce a chain rule that enables the use of gradient descent for objective functions factoring through the space of barcodes. We illustrate the versatility of this framework by showing how it can be used to analyze the smoothness of various parametrized families of filtrations arising in topological data analysis.


Motivation
Barcodes have been introduced in topological data analysis (TDA) as a means to encode the topological structure of spaces and real-valued functions. They have been shown to provide complementary information compared to classical geometric or statistical methods, which explains their interest for applications. However, so far they have been essentially used as an alternative representation of the input, engineered by the user, as opposed to optimized to fit the problem best.
Optimizing barcodes using e.g. gradient descent requires differentiating objective functions that factor through the space $\mathrm{Bar}$ of barcodes:
$$\mathcal{M} \longrightarrow \mathrm{Bar} \longrightarrow \mathbb{R}, \tag{1}$$
where $\mathcal{M}$ is a parameter space equipped with a differential structure, typically a smooth finite-dimensional manifold. A compelling example arises in the context of supervised learning, where the barcodes can be used as features for data, generated by using some filter function $f : K \to \mathbb{R}$ on a fixed graph or simplicial complex $K$. Instead of considering $f$ as a hyperparameter, it can be beneficial to optimize it among a family $\{f_\theta : K \to \mathbb{R}\}_{\theta \in \mathcal{M}}$ parametrized by a smooth map which we call the parametrization:
$$F : \theta \in \mathcal{M} \longmapsto f_\theta \in \mathbb{R}^K.$$
Post-composing $F$ with the persistent homology operator $\mathrm{Dgm}_p$ in homology degree $p$ yields a map $\mathrm{Dgm}_p \circ F : \mathcal{M} \to \mathrm{Bar}$. Given a loss function $L : \mathrm{Bar} \to \mathbb{R}$, the goal is then to minimize the functional
$$\mathcal{M} \xrightarrow{\ \mathrm{Dgm}_p \circ F\ } \mathrm{Bar} \xrightarrow{\ L\ } \mathbb{R} \tag{2}$$
using variational approaches, which are standard in large-scale learning applications. In order to do so, we need to put a sensible smooth structure on $\mathrm{Bar}$ and to derive an analogue of the chain rule, so that we can compute the differential of $L \circ \mathrm{Dgm}_p \circ F$ as the composition of the differentials of $L$ and $\mathrm{Dgm}_p \circ F$. The difficulty is that $\mathrm{Bar}$ is not a manifold and so far has not been given a structure in which the above makes sense.
Beyond optimization, we want to be able to address other types of applications where differential calculus is involved. For this, a variety of potential scenarios must be considered, e.g. when the filter functions are defined on a fixed smooth manifold, or when the second arrow in (1) takes its values in $\mathbb{R}^n$ or more generally in some smooth finite-dimensional manifold. The goal of our study is to provide a unified framework that accounts for all these scenarios.

Related work
Despite the lack of a smooth structure on the space Bar, developing heuristic methods to differentiate the composition in Equation (2) has been an active direction of research lately, leading to innovative computational applications. In Table 1, we specify, for each of these contributions, the choice of parametrization F and of loss function L, the optimization problem under consideration, and the sufficient conditions worked out to guarantee the differentiability of the composition in (2).
In the context of point cloud inference considered by [GHO16], the positions of points in a fixed Euclidean space form the parameter space M, and the resulting Rips filtration (resp. Alpha filtration) of the total complex on the point cloud is the parametrization F . The loss function L is given by the least-squares approximation of a fixed barcode. By developing a clear functional point of view on the connection between the barcode of the Rips or Alpha filtration and the positions of the points in the cloud, based on lifts to Euclidean space, the authors show that L is differentiable wherever the pairwise distances between points in the cloud are distinct. The approach is further refined by [DC19], where it is observed that the parametrization F is a subanalytic map, which implies that the barcode-valued map admits subanalytic (hence generically differentiable) lifts. In turn, this fact is leveraged to show that any probability measure with a density w.r.t. the Hausdorff measure on M induces an expected persistence diagram (viewed as a measure in the plane) with a density w.r.t. the Lebesgue measure.
In many applications, $F$ parametrizes lower-star filtrations, i.e. filter functions induced by their restrictions to the vertices of $K$ [BGGSSG20, CNBW19, GNDS20, HGR+20, HLSC19, PSO18]. In [PSO18], the problem of shape matching is cast into an optimization problem involving the barcodes of the shapes. [CNBW19] uses the degree-0 persistent homology as a regularizer for classifiers. Similarly, [HLSC19] proposes a persistence-based regularization as an additional loss for deep learning models in the context of image segmentation. In [HGR+20], a dataset of graphs is seen as part of a bigger common simplicial complex, which makes it possible to learn a filter function that is shared across the whole dataset. These contributions require the differentiability of (2), and they show that it holds whenever the filter function $f_\theta$ is injective over the vertex set.
Table 1: Current frameworks for differentiating the composition in (2). The first column lists the targeted applications. The second and third columns show the choices of loss function $L$ and parametrization $F$. The differentiability of $L \circ \mathrm{Dgm}_p \circ F$ is guaranteed under the conditions listed in the fourth column.
Application | Loss function $L$ | Parametrization $F$ | Differentiability condition
 | Weighted sum of persistences | |
Point cloud continuation [GHO16] | Least-squares approximation of a fixed barcode | Point clouds determining a Rips filtration | Distinct pairwise distances between points
Functions on a grid are used in [BGGSSG20] to tackle the problem of surface reconstruction. These functions are sums of Gaussians whose means and variances are parameters one wants to optimize according to an objective/loss that depends on the degree-1 persistent homology of the functions. [GNDS20] considers optimization problems involving persistence with many useful applications, as in generative modelling, classification robustness, and adversarial attacks. Both contributions need to take the derivative of (2), and to do so, they require the existence of an inverse map taking interval endpoints in the persistence diagram $\mathrm{Dgm}_p(f_\theta)$ to the corresponding vertices of $K$. This is a strictly weaker requirement than the injectivity of $f_\theta$, as used in the previous contributions, because an inverse map always exists (provided for instance by the standard reduction algorithm for persistent homology). However, per se, it does not guarantee the differentiability of the composition; see e.g. [HGR+20] for a counter-example.
This variety of applications motivates the search for a unified framework for expressing the differentiability of the arrows in diagrams of the form:
$$\mathcal{M} \xrightarrow{\ B\ } \mathrm{Bar} \xrightarrow{\ V\ } \mathbb{R}. \tag{3}$$
Since the first appearance of this paper as a preprint, there have been novel applications of persistence differentiability in optimisation. For instance, the first author has developed a graph classification framework based on the Laplacian operator [YL21], applying the differentiability of the persistence map (Theorem 4.9) to the case of extended persistence. In addition, new heuristics to smooth and regularise loss functions as in Eq. (3) improved the optimisation procedure for specific data science problems [CD20, SWB20]. Another strong guarantee is provided when the loss in Eq. (3) is semi-algebraic (and more generally subanalytic or definable in some o-minimal structure), as then the classic stochastic gradient descent (SGD) algorithm converges to critical points [DDKL20]. The bridge between this result in non-smooth analysis and persistence-based optimisation problems is made in [CCG+21], where sufficient conditions for loss functions as in Eq. (3) to be semi-algebraic are given. The main results of [CCG+21] also derive from our general framework, see Remark 4.25.

Contributions and outline of the paper
Ultimately, our framework should make it possible to determine when and how maps between smooth manifolds $\mathcal{M}$ and $\mathcal{N}$ that factor through the space of barcodes can be differentiated:
$$\mathcal{M} \xrightarrow{\ B\ } \mathrm{Bar} \xrightarrow{\ V\ } \mathcal{N}.$$
To achieve this goal, in Section 3 we define differentiability via lifts in full generality, thereby extending the approach initially proposed by [GHO16] for the specific case of parametrizations by Rips filtrations. Here we provide some of the details. As a space of multi-sets (assumed by default to have finitely many off-diagonal points), $\mathrm{Bar}$ does not naturally come equipped with a differential structure. However, it is covered by maps of the form
$$Q_{m,n} : \mathbb{R}^{2m} \times \mathbb{R}^n \longrightarrow \mathrm{Bar},$$
where $\mathbb{R}^{2m} \times \mathbb{R}^n$ can be thought of as the space of ordered barcodes with fixed number $m$ (resp. $n$) of finite (resp. infinite) intervals, and where $Q_{m,n}$ is the quotient map modulo the order, turning vectors into multisets (Definition 3.1). Then, the map $B : \mathcal{M} \to \mathrm{Bar}$ is said to be $r$-differentiable at parameter $\theta \in \mathcal{M}$ if it admits a local $C^r$ lift $\tilde{B}$ into $\mathbb{R}^{2m} \times \mathbb{R}^n$ for some $m, n \in \mathbb{N}$:
$$B|_U = Q_{m,n} \circ \tilde{B}, \qquad \tilde{B} : U \to \mathbb{R}^{2m} \times \mathbb{R}^n. \tag{4}$$
This means that the map $\tilde{B}$ tracks smoothly and consistently the points in the barcodes $B(\theta')$, for $\theta'$ ranging over some open neighborhood $U$ of $\theta$. Dually, the map $V : \mathrm{Bar} \to \mathcal{N}$ is $r$-differentiable at $D \in \mathrm{Bar}$ if for every possible choice of $m, n$, the composition $V \circ Q_{m,n} : \mathbb{R}^{2m} \times \mathbb{R}^n \to \mathcal{N}$ is $C^r$ on an open neighborhood of every pre-image $\tilde{D}$ of $D$:
$$\mathbb{R}^{2m} \times \mathbb{R}^n \xrightarrow{\ Q_{m,n}\ } \mathrm{Bar} \xrightarrow{\ V\ } \mathcal{N}. \tag{5}$$
The choice of $m, n$ and pre-image $\tilde{D}$ of $D$ should be thought of as the type of perturbation we allow around $D$. Thus, essentially, $V$ is asked to be smooth with respect to any finite perturbation of $D$. In Section 3.5 we connect these definitions to the theory of diffeological spaces, showing that our two definitions of differentiability for maps $B$ and $V$ are dual to each other and make the barcode space $\mathrm{Bar}$ a diffeological space.
We then define the differentials of the maps $B$ and $V$, given simply by the differentials of the lift $\tilde{B} : U \to \mathbb{R}^{2m} \times \mathbb{R}^n$ (for $B$) and of the composition $V \circ Q_{m,n}$ at the pre-image $\tilde{D} \in \mathbb{R}^{2m} \times \mathbb{R}^n$ (for $V$). Although these differentials taken individually are not defined uniquely, their corresponding diagrams (4) and (5) combine as follows:
$$U \xrightarrow{\ \tilde{B}\ } \mathbb{R}^{2m} \times \mathbb{R}^n \xrightarrow{\ Q_{m,n}\ } \mathrm{Bar} \xrightarrow{\ V\ } \mathcal{N},$$
implying that the composition $V \circ B = (V \circ Q_{m,n}) \circ \tilde{B}$ is a $C^r$ map between smooth manifolds, whose derivative is obtained by composing the differentials of $B$ and $V$, regardless of the choice of lift and pre-image. This is our analogue of the chain rule in ordinary differential calculus (Proposition 3.14).
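The chain rule described above can be checked numerically on a toy case. The following is a minimal Python sketch under stated assumptions: the one-parameter lift `lift` (two finite bars, no infinite bar) and the loss (sum of squared bar lengths) are hypothetical choices made for this illustration and do not come from the paper. The sketch composes the Jacobian of the lift with the gradient of $V \circ Q_{m,n}$, compares the result with a finite difference, and checks that reordering the bars of the lift leaves the derivative unchanged.

```python
def lift(theta):
    """Hypothetical C^1 lift into R^{2m} with m = 2, n = 0: (b1, d1, b2, d2)."""
    return [theta, theta ** 2 + 2.0, -theta, 3.0]


def lift_jacobian(theta):
    """Derivative of each coordinate of `lift` with respect to theta."""
    return [1.0, 2.0 * theta, -1.0, 0.0]


def loss_on_ordered(ordered):
    """V o Q: sum of squared bar lengths, invariant under reordering of bars."""
    return sum((ordered[2 * i + 1] - ordered[2 * i]) ** 2
               for i in range(len(ordered) // 2))


def loss_gradient(ordered):
    """Gradient of `loss_on_ordered` with respect to the ordered barcode."""
    grad = []
    for i in range(len(ordered) // 2):
        length = ordered[2 * i + 1] - ordered[2 * i]
        grad += [-2.0 * length, 2.0 * length]
    return grad


def swap_bars(vec):
    """Exchange the two bars: another valid ordering of the same barcode."""
    return [vec[2], vec[3], vec[0], vec[1]]


def chain_rule_derivative(theta, reorder=lambda vec: vec):
    """Compose the differential of the (reordered) lift with that of V o Q."""
    grad = loss_gradient(reorder(lift(theta)))
    jac = reorder(lift_jacobian(theta))
    return sum(g * j for g, j in zip(grad, jac))


def finite_difference(theta, h=1e-6):
    """Central finite difference of the composite loss, for comparison."""
    return (loss_on_ordered(lift(theta + h))
            - loss_on_ordered(lift(theta - h))) / (2.0 * h)
```

For instance, `chain_rule_derivative(1.0)` and `chain_rule_derivative(1.0, swap_bars)` agree, which mirrors the statement that the derivative of the composition does not depend on the chosen lift.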
In Sections 4 and 6, we focus on barcode-valued maps $B : \mathcal{M} \to \mathrm{Bar}$ arising from filter functions on fixed smooth manifolds or simplicial complexes. These maps are usually not differentiable everywhere on their domain. However, motivated by the aforementioned applications, we seek conditions under which $B$ is differentiable almost everywhere on $\mathcal{M}$. A natural approach for this would be to use Rademacher's theorem [Fed69, Thm. 3.1.6], as we know that $B$ is Lipschitz continuous by the Stability Theorem of persistent homology [BL15, CdSGO16, CSEH07]. However, this approach has several important shortcomings:
• it depends on a choice of measure on $\mathcal{M}$;
• it calls for a generalization of Rademacher's theorem to maps taking values in arbitrary metric spaces, and to the best of our knowledge, existing generalizations only provide directional metric differentials (see e.g. [Pan89]);
• more fundamentally, it is not constructive and therefore does not provide formulae for the differentials;
• finally, in the context of optimization, it is important to guarantee the existence of differentials/gradients in an open neighborhood of the considered parameter $\theta$, and not just in a full-measure subset.
We therefore propose to follow a different approach, seeking conditions that ensure the differentiability of B on a generic (i.e. open and dense) subset of M, with explicit differential.
Our first scenario (Section 4) considers a parametrization $F : \mathcal{M} \to \mathbb{R}^K$ of filter functions on a fixed simplicial complex $K$. Given a homology degree $p \le d$, where $d$ is the maximal simplex dimension in $K$, the barcode-valued map $B$ decomposes as $B = \mathrm{Dgm}_p \circ F$, and in Theorem 4.9 we show that $B$ is $r$-differentiable on a generic subset of $\mathcal{M}$ whenever $F$ is $C^r$ over $\mathcal{M}$ or a generic subset thereof. The proof relies on the fact that the pre-order on the simplices of $K$ induced by the values assigned by the filter function $F(\theta)$ is generically constant around $\theta$ in $\mathcal{M}$.
We then relate the differential of $B$ to that of $F$ in Proposition 4.14, yielding a closed formula that can be leveraged in practical implementations. Finally, we study the behavior of $B$ at singular points by means of a stratification of the parameter space $\mathcal{M}$, whereby the top-dimensional strata are the locations where $B$ is differentiable, and the lower-dimensional strata characterize the defect of differentiability of $B$. We show in Theorem 4.19 that we can define directional derivatives along each incident stratum at any given point $\theta \in \mathcal{M}$. We also show that the barcode-valued map can be globally lifted and expressed as a permutation map on each stratum (Corollary 4.24).
In Section 5 we illustrate the impact of our framework on a series of examples of parametrizations coming from earlier work, including lower-star filtrations, Rips filtrations and some of their generalizations. For each example, we examine the differentiability of the barcode-valued map and, whenever readily computable, we give the expressions of its differential. This allows us to recover the differentiability results from earlier work in a principled way.
Our second scenario (Section 6) considers a parametrization $F : \mathcal{M} \to C^\infty(\mathcal{X}, \mathbb{R})$ of smooth filter functions on a fixed smooth compact $d$-dimensional manifold $\mathcal{X}$. In this scenario, given a parameter $\theta \in \mathcal{M}$, the barcode-valued map $B$ computes all the barcodes of $f_\theta$ at once, and collates them in a vector of barcodes:
$$B(\theta) = \big(\mathrm{Dgm}_0(f_\theta), \dots, \mathrm{Dgm}_d(f_\theta)\big).$$
We show that $B$ is $\infty$-differentiable at any parameter $\theta$ such that $f_\theta$ is Morse with distinct critical values (Theorem 6.1).
The key insights are: on the one hand, that at any such parameter $\theta$ the implicit function theorem allows us to smoothly track the critical points of $f_{\theta'}$ as $\theta'$ ranges over a small enough open neighborhood around $\theta$; on the other hand, that the Stability Theorem provides a consistent correspondence between the critical points of $f_{\theta'}$ and the interval endpoints in its barcodes.
In Section 7 we look at examples of classes of maps $V : \mathrm{Bar} \to \mathcal{N}$. We first consider persistence images [AEK+17] and more generally linear representations of barcodes, as an illustration of our framework on barcode vectorizations. We show that persistence images and linear representations are $\infty$-differentiable under suitable choices of weighting function (Propositions 7.3 and 7.5). We then consider the case where $V : \mathrm{Bar} \to \mathbb{R}$ is the bottleneck or Wasserstein distance to a fixed barcode, and show it is semi-algebraic in a suitable sense (Proposition 7.7), which is useful in a context of optimisation. We then focus on the bottleneck distance to a fixed barcode $D_0$, which we believe can be of interest in the context of inverse problems. We show that this distance is differentiable on a generic subset of $\mathrm{Bar}$ (Propositions 7.9 and B.1).
Finally, throughout the paper we sprinkle our exposition with examples of parametrizations and loss functions that illustrate our results and demonstrate their potential for applications.

Preliminary notions
Throughout the paper, vector spaces and homology groups are taken over a fixed field $k$, omitted in our notations whenever clear from the context. As much as possible, we keep separate terminologies for different notions of differentiability: for instance, maps from or to the space of barcodes are called $r$-differentiable, whereas maps between manifolds are simply called $C^r$. The only exception to this rule is the term smooth for maps, which has a versatile meaning that should nonetheless always be clear from the context.

Persistence modules and persistent homology
Definition 2.1. A persistence module $\mathbb{V}$ is a functor from the poset $(\mathbb{R}, \le)$ to the category $\mathrm{Vect}_k$ of vector spaces over $k$.
In other words, a persistence module is a collection $\mathbb{V} = \{V_t,\ v_{s,t} : V_s \to V_t\}_{(s,t) \in \mathbb{R}^2,\ s \le t}$ of vector spaces $V_t$ and linear maps $v_{s,t}$, such that $v_{t,t} = \mathrm{id}_{V_t}$ for all $t \in \mathbb{R}$ and $v_{s,t} \circ v_{r,s} = v_{r,t}$ for all $r \le s \le t \in \mathbb{R}$. We say that $\mathbb{V}$ is pointwise finite-dimensional (or pfd for short) if every $V_t$ is finite-dimensional. Unless otherwise stated, persistence modules in the following will be pfd.
Definition 2.2. A morphism $\eta : \mathbb{V} \to \mathbb{W}$ between two persistence modules is a natural transformation between functors.
In other words, writing $\mathbb{V} = \{V_t, v_{s,t}\}_{s \le t}$ and $\mathbb{W} = \{W_t, w_{s,t}\}_{s \le t}$, a morphism $\eta : \mathbb{V} \to \mathbb{W}$ is a collection of linear maps $\{\eta_t : V_t \to W_t\}_{t \in \mathbb{R}}$ such that $\eta_t \circ v_{s,t} = w_{s,t} \circ \eta_s$ for all $s \le t$, that is, the corresponding square of linear maps commutes. We say that $\eta$ is an isomorphism of persistence modules if all the $\eta_t$ are isomorphisms of vector spaces. We denote by $\mathrm{Pers}$ the category of persistence modules. $\mathrm{Pers}$ is an abelian category, so it admits kernels, cokernels, images and direct sums, which are defined pointwise. By Crawley-Boevey's Theorem [CB15], we know that persistence modules essentially uniquely decompose as direct sums of elementary modules called interval modules. The interval module $\mathbb{I}_J$ associated to an interval $J$ of $\mathbb{R}$ is defined as the module with copies of the field $k$ over $J$ and zero spaces elsewhere, the copies of $k$ being connected by identity maps.
Theorem 2.3. For any persistence module $\mathbb{V}$, there is a unique multi-set $\mathcal{J}$ of intervals of $\mathbb{R}$ such that
$$\mathbb{V} \cong \bigoplus_{J \in \mathcal{J}} \mathbb{I}_J. \tag{6}$$
Persistence modules of particular interest are the ones induced by the sub-level sets of real-valued functions.
Definition 2.4. Let $f : X \to \mathbb{R}$ be a real-valued function on a topological space. Write $X_t := f^{-1}((-\infty, t])$ for the closed sublevel set of $f$ at level $t \in \mathbb{R}$. Given $p \in \mathbb{N}$, the sublevel set persistent homology of $f$ in degree $p$ is the (not necessarily pfd) persistence module $H_p(f)$ defined by:
• the vector spaces $\{H_p(X_t)\}_{t \in \mathbb{R}}$, where $H_p$ is the singular homology functor in degree $p$ with coefficients in $k$;
• the linear maps $\{v_{s,t} : H_p(X_s) \to H_p(X_t)\}_{s \le t}$ induced by inclusions $X_s \hookrightarrow X_t$.
In the following we restrict our focus to finite-type persistence modules induced by tame functions, defined as follows:
Definition 2.5. A persistence module $\mathbb{V}$ is of finite type if it admits a decomposition into finitely many interval modules.
Definition 2.6. A function $f : X \to \mathbb{R}$ is tame if its persistent homology modules in any degree are of finite type.
In particular, filter functions on a finite simplicial complex (see below) and Morse functions on a smooth manifold (see Section 2.3) are tame.
Definition 2.7. Let $K$ be a finite simplicial complex. A filter function $f : K \to \mathbb{R}$ is a function that is monotone with respect to inclusions of faces in $K$, i.e. $f(\sigma) \le f(\sigma')$ for all $\sigma \subseteq \sigma' \in K$. This implies in particular that every sublevel set $K_t := \{\sigma \in K \mid f(\sigma) \le t\}$ is a sub-complex of $K$.
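To make the definition of a filter function concrete, here is a minimal Python sketch that checks monotonicity with respect to face inclusion and verifies that a sublevel set is a sub-complex. The representation is an assumption made for this example: simplices are sorted tuples of vertex labels, the complex is given by the keys of a dictionary that is closed under taking faces, and all function names are hypothetical.

```python
from itertools import combinations


def proper_faces(simplex):
    """All non-empty proper faces of a simplex given as a sorted tuple."""
    return [face
            for k in range(1, len(simplex))
            for face in combinations(simplex, k)]


def is_filter_function(f):
    """True if f(face) <= f(simplex) for every face of every simplex.

    `f` maps each simplex of K to a real value; its keys are assumed
    to form a simplicial complex (closed under taking faces).
    """
    return all(f[face] <= f[simplex]
               for simplex in f
               for face in proper_faces(simplex))


def sublevel_set(f, t):
    """Simplices whose filter value is at most t."""
    return {simplex for simplex, value in f.items() if value <= t}


def is_subcomplex(simplices):
    """True if the set of simplices is closed under taking faces."""
    return all(face in simplices
               for simplex in simplices
               for face in proper_faces(simplex))


# A filled triangle with a lower-star filter function:
# each simplex takes the maximum value of its vertices.
triangle = {
    (0,): 0.0, (1,): 1.0, (2,): 2.0,
    (0, 1): 1.0, (0, 2): 2.0, (1, 2): 2.0,
    (0, 1, 2): 2.0,
}
```

With this input, `is_filter_function(triangle)` holds and every `sublevel_set(triangle, t)` passes `is_subcomplex`, whereas lowering the value of the edge `(0, 1)` below that of vertex `(1,)` breaks monotonicity.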

Persistence barcodes / diagrams
Given a decomposition of a finite-type persistence module V as in (6), the (finite) multi-set J is called the barcode of V.
An alternate representation is as a (finite) multiset $B$ of points in the plane, where each interval $J \in \mathcal{J}$ is mapped to the point $(\inf J, \sup J)$. To this multiset of points we add $\Delta^\infty$, that is the multiset containing countably many copies of the diagonal $\Delta := \{(b, b) \mid b \in \mathbb{R}\}$, to obtain the so-called persistence diagram of $\mathbb{V}$. When $\mathbb{V}$ is the sublevel set persistent homology of a tame function $f$ in degree $p$, we denote by $\mathrm{Dgm}_p(f)$ its persistence diagram. Persistence diagrams can also be defined independently of persistence modules as follows:
Definition 2.8. A persistence diagram is the union of a finite multiset of points in the extended plane $\overline{\mathbb{R}} \times \overline{\mathbb{R}}$ with countably many copies of the diagonal $\Delta$. The set of persistence diagrams is denoted by $\mathrm{Bar}$.
From now on we also use the terminology barcodes for persistence diagrams. Following this terminology, we also call intervals the points in a persistence diagram. Points lying on the diagonal $\Delta$ are qualified as diagonal, the others are qualified as off-diagonal.
Remark 2.9. In the above definitions we follow the literature on extended persistence, in which persistence diagrams can have points everywhere in the extended plane $\overline{\mathbb{R}} \times \overline{\mathbb{R}}$. This is because our framework extends naturally to that setting. Note also that, in the literature, the diagonal is sometimes not included in the diagrams. Here we are including it with infinite multiplicity. This is in the spirit of taking the quotient category of observable persistence modules, as defined by [CCBdS16].
Definition 2.10. Given two barcodes $D, D' \in \mathrm{Bar}$, viewed as multisets, a matching is a bijection $\gamma : D \to D'$. The cost of $\gamma$ is the quantity
$$c(\gamma) := \sup_{x \in D} \|x - \gamma(x)\|_\infty \in \overline{\mathbb{R}}.$$
We denote by $\Gamma(D, D')$ the set of all matchings between $D$ and $D'$.
Definition 2.11. The bottleneck distance between two barcodes $D, D' \in \mathrm{Bar}$ is
$$d_\infty(D, D') := \inf_{\gamma \in \Gamma(D, D')} c(\gamma).$$
Given $q \in \mathbb{R}^*$, a slight modification of the matching cost yields the $q$-th Wasserstein distance on barcodes as introduced in [CSEHM10]:
$$d_q(D, D') := \inf_{\gamma \in \Gamma(D, D')} \Big( \sum_{x \in D} \|x - \gamma(x)\|_\infty^q \Big)^{1/q}.$$
Since we include all points in the diagonal with infinite multiplicity in our definition of barcodes, $d_\infty$ is a true metric and not just a pseudo-metric. Indeed, for any $D, D' \in \mathrm{Bar}$, we have $d_\infty(D, D') = 0 \Rightarrow D = D'$. We call bottleneck topology the topology induced by $d_\infty$, which by the previous observation makes $\mathrm{Bar}$ a Hausdorff space.
A key fact is the Lipschitz continuity of the barcode function, known as the Stability Theorem [BL15, CdSGO16, CSEH07]:
Theorem 2.12. Let $f, g : X \to \mathbb{R}$ be two real-valued functions with well-defined barcodes. Then,
$$d_\infty(\mathrm{Dgm}_p(f), \mathrm{Dgm}_p(g)) \le \|f - g\|_\infty.$$
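As an illustration of $\mathrm{Dgm}_0(f)$ in the simplest setting, the following Python sketch computes the degree-0 barcode of a filter function on a graph. It rests on assumptions made for this example only: the filter function is a lower-star filtration (each edge takes the maximum of its endpoint values), the algorithm is the standard union-find procedure with the elder rule rather than anything specific to this paper, zero-length bars (diagonal points) are discarded, and the function name is hypothetical.

```python
def degree0_barcode(vertex_values, edges):
    """Degree-0 sublevel-set barcode of a lower-star filtration on a graph.

    Returns (finite_bars, infinite_births): the sorted list of finite bars
    (birth, death) with birth < death, and the sorted birth values of the
    bars that never die (one per connected component).
    """
    parent = list(range(len(vertex_values)))
    # birth[r] is the smallest vertex value in the component rooted at r.
    birth = list(vertex_values)

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    def edge_value(edge):
        return max(vertex_values[edge[0]], vertex_values[edge[1]])

    finite_bars = []
    for edge in sorted(edges, key=edge_value):
        t = edge_value(edge)
        root_u, root_v = find(edge[0]), find(edge[1])
        if root_u == root_v:
            continue  # the edge closes a cycle: no change in degree 0
        # Elder rule: the component born last dies at value t.
        if birth[root_u] > birth[root_v]:
            root_u, root_v = root_v, root_u
        if birth[root_v] < t:
            finite_bars.append((birth[root_v], t))
        parent[root_v] = root_u

    roots = {find(u) for u in range(len(vertex_values))}
    return sorted(finite_bars), sorted(birth[r] for r in roots)


# Path graph on five vertices with two local minima besides the global one.
values = [0.0, 2.0, 1.0, 3.0, 0.5]
path_edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
```

Here each finite bar pairs a local minimum of the vertex values with the value at which its component merges into an older one, and the single infinite bar starts at the global minimum.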
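The definition of the bottleneck distance can be unpacked on very small examples by brute force. The sketch below is a minimal Python illustration, assuming barcodes with finitely many off-diagonal points and no infinite bars; it enumerates all matchings, so it is only usable for a handful of points and is not an efficient algorithm. Matching a point to the diagonal costs half its persistence in the sup norm, and all function names are hypothetical.

```python
from itertools import permutations


def sup_distance(x, y):
    """Sup-norm distance between two points of the plane."""
    return max(abs(x[0] - y[0]), abs(x[1] - y[1]))


def diagonal_distance(x):
    """Sup-norm distance from the point (b, d) to the diagonal."""
    return abs(x[1] - x[0]) / 2.0


def bottleneck_bruteforce(D1, D2):
    """Bottleneck distance between two small finite barcodes.

    D1 and D2 are lists of (birth, death) pairs with finite coordinates.
    Each barcode is padded with as many diagonal slots as the other one
    has points, so that every matching is a permutation.
    """
    p, q = len(D1), len(D2)
    n = p + q
    if n == 0:
        return 0.0

    def cost(i, j):
        if i < p and j < q:
            return sup_distance(D1[i], D2[j])   # point matched to point
        if i < p:
            return diagonal_distance(D1[i])     # point of D1 to diagonal
        if j < q:
            return diagonal_distance(D2[j])     # point of D2 to diagonal
        return 0.0                              # diagonal to diagonal

    return min(max(cost(i, sigma[i]) for i in range(n))
               for sigma in permutations(range(n)))
```

For example, two distant bars of length one are at bottleneck distance one half, because sending both to the diagonal is cheaper than matching them to each other.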
Note that the assumptions in the theorem are quite general and hold in our cases of interest: tame functions on a compact manifold, and filter functions on a simplicial complex.

Morse functions
Morse functions are a special type of tame functions, for which there is a bijective correspondence between critical points in the domain and interval endpoints in the barcode. This correspondence, detailed in Proposition 2.14, will be instrumental in the analysis of Section 6. For a proper introduction to Morse theory, we refer the reader to [Mil63].
Definition 2.13. Given a smooth $d$-dimensional manifold $\mathcal{X}$, a smooth function $f : \mathcal{X} \to \mathbb{R}$ is called Morse if its Hessian at critical points (i.e. points where the gradient of $f$ vanishes) is nondegenerate.
Note that we do not assume a priori that the values of $f$ at critical points (called critical values) are all distinct. For such a value $a$, we call multiplicity of $a$ the number of critical points in the level-set $f^{-1}(a)$. We also introduce the notation $\mathrm{Crit}(f)$ to refer to the set of critical points, which is discrete in $\mathcal{X}$. In particular, if $\mathcal{X}$ is compact, which will be the case in this paper, $\mathrm{Crit}(f)$ is finite. The number of negative eigenvalues of the Hessian of $f$ at a critical point $x$ is called the index of $x$.
Proposition 2.14. Assume $\mathcal{X}$ is compact and all the critical values of $f$ have multiplicity 1. Denote by $E(f)$ the multiset of finite endpoints of off-diagonal intervals (including the left endpoints of infinite intervals) of $\mathrm{Dgm}_0(f) \sqcup \dots \sqcup \mathrm{Dgm}_d(f)$.
Then, $f$ induces a bijection $\mathrm{Crit}(f) \to E(f)$.
This result is folklore, and we give a proof only for completeness.
Proof. Let $a \le b$ be real numbers. Write $X_a$ for the sublevel set $f^{-1}((-\infty, a])$. If $[a, b]$ contains a unique critical value $c$ of $f$, then $X_b$ has the homotopy type of $X_a$ glued together with a cell $e^p$ of dimension $p$, where $p$ is the index of the unique critical point $x$ associated to $c$ [Mil63]. Therefore, $H_*(X_b, X_a)$ is trivial except for $* = p$, where it is spanned by the homology class of $e^p$. This does not depend on the choice of $a, b$ surrounding $c$ and sufficiently close to it. Then, using the long exact sequence in homology, we deduce that either there is one birth in degree $p$ at value $c$ in the persistent homology module, or there is one death in degree $p - 1$. Hence, $c$ is either a left endpoint of an interval of $\mathrm{Dgm}_p(f)$, or a right endpoint of an interval of $\mathrm{Dgm}_{p-1}(f)$. In either case, we can define the map $x \mapsto f(x)$ for any $x \in \mathrm{Crit}(f)$, and we have just shown that it takes its values in $E(f)$. The map is injective because the critical values of $f$ have multiplicity 1 by assumption. We now show it is onto. Let $a \in \mathbb{R}$ be a non-critical value of $f$. For any (small enough) $\varepsilon, \eta > 0$, the interval $[a - \eta, a + \varepsilon]$ contains no critical value of $f$, therefore $X_{a+\varepsilon}$ deformation retracts onto $X_{a-\eta}$, which implies that the maps $H_p(X_{a-\eta}) \to H_p(X_{a+\varepsilon})$ induced by inclusion are isomorphisms for any homology degree $p$. By the decomposition Theorem 2.3, this implies that $a$ cannot be an endpoint of an interval summand, i.e. $a \notin E(f)$. Hence every element of $E(f)$ is a critical value of $f$, and the map is onto.
The assumption that each critical value of $f$ has multiplicity 1 is superfluous in Proposition 2.14, if we allow the correspondence map to match trivial intervals. Let $[a, b]$ be an interval containing a unique critical value $c$. One can still use Morse theory and glue as many critical cells $e^p$ to $X_a$ as there are critical points in $f^{-1}(c)$ in order to obtain a CW structure on $X_b$ from the one of $X_a$. Considering the different critical cells, we know exactly the ranks of the morphisms $H_p(X_a) \to H_p(X_b)$ induced by inclusions in each homology degree $p$.

Diffeology theory
Diffeology theory provides a principled approach to equip a set with a smooth structure. We use some concepts of the theory in Section 3.5, where we equip the set Bar of barcodes with a diffeology and identify the resulting smooth maps. We refer the reader to [IZ13] for a detailed introduction to the material presented below. In the following, we call domain any open set in any arbitrary Euclidean space.
Definition 2.15. Given a non-empty set $S$, a diffeology is a collection $\mathcal{D}$ of pairs $(U, P)$, called plots, where $U$ is a domain and $P : U \to S$ is a map from $U$ to $S$, satisfying the following axioms:
(Covering) For any element $s \in S$ and any integer $n \in \mathbb{N}$, the constant map $x \in \mathbb{R}^n \mapsto s \in S$ is a plot.
(Locality) If for a pair $(U, P)$ we have that, for any $x \in U$, there exists an open neighborhood $U' \subseteq U$ of $x$ such that the restriction $(U', P|_{U'})$ is a plot, then $(U, P)$ itself is a plot.
(Smoothness compatibility) For any plot $(U, P)$ and any smooth map $F : W \to U$ where $W$ is a domain, the composition $(W, P \circ F)$ is a plot.
If a set $S$ comes equipped with a diffeology $\mathcal{D}$, then it is called a diffeological space. We think of a diffeological space $S$ as a space where we impose which functions from a manifold to $S$, namely the plots, are smooth. Notice that any set can be made a diffeological space by taking all possible maps as plots. This is the coarsest diffeology on $S$, where $\mathcal{D}$ is said to be finer than the diffeology $\mathcal{D}'$ if $\mathcal{D} \subset \mathcal{D}'$, and coarser if the converse inclusion holds. The prototypical diffeological space is the Euclidean space $\mathbb{R}^n$ with the usual smooth maps from domains to $\mathbb{R}^n$ as plots.
Definition 2.16. A morphism $f : S \to S'$, or smooth map, between two diffeological spaces $S$ and $S'$, is a map such that for each plot $P$ of $S$, $f \circ P$ is a plot of $S'$. The map $f$ is called a diffeomorphism if it is a bijection and $f^{-1} : S' \to S$ is smooth. A map $f : A \to S'$, where $A \subseteq S$, is locally smooth if for any plot $P$ of $S$, $f \circ P|_{P^{-1}(A)}$ is a plot of $S'$. The map $f$ is a local diffeomorphism if it is a bijection onto its image and if $f^{-1}$ is locally smooth as a map $S' \supseteq f(A) \to S$.
Obviously, identities are smooth, and smooth maps compose into smooth maps, therefore we can consider the category $\mathrm{Diffeo}$ of diffeological spaces. Finite-dimensional smooth manifolds with or without boundaries and corners, Fréchet manifolds and Frölicher spaces, viewed as diffeological spaces with their usual smooth maps, form strict subcategories of $\mathrm{Diffeo}$. In fact, finite-dimensional smooth manifolds can be defined in the context of diffeology as follows:
Definition 2.17. A diffeological space $\mathcal{M}$ is an $n$-dimensional diffeological manifold if it is locally diffeomorphic to $\mathbb{R}^n$ at every point in $\mathcal{M}$.
Theorem 2.18 ([IZ13, § 4.3]). Every $n$-dimensional smooth manifold $\mathcal{M}$ is an $n$-dimensional diffeological manifold once equipped with the diffeology given by the smooth maps $U \to \mathcal{M}$ from arbitrary domains $U$. Conversely, every $n$-dimensional diffeological manifold is an $n$-dimensional smooth manifold.
One appealing feature of $\mathrm{Diffeo}$, compared to the category of smooth manifolds for instance, is that it is closed under usual set operations; here we only consider coproducts and quotients:
Definition 2.19. For an arbitrary family of diffeological spaces $\{(S_j, \mathcal{D}_j)\}_{j \in J}$, the sum diffeology on $\bigsqcup_{j \in J} S_j$ is the finest diffeology making the injections $S_i \to \bigsqcup_{j \in J} S_j$ smooth.
Definition 2.20. For a diffeological space $(S, \mathcal{D})$ and an equivalence relation $\sim$ on $S$, the quotient diffeology on $S/\!\sim$ is the finest diffeology making the quotient map $S \to S/\!\sim$ smooth.

Stratified manifolds
Stratified manifolds play a role in Section 4.3 of this paper. For background material on the subject, see e.g. [Mat12].
(Condition b) Consider a pair of strata $(M_1, M_2)$ and an element $\theta \in M_1$. If there are sequences of points $(\theta^1_k)_{k \in \mathbb{N}}$ and $(\theta^2_k)_{k \in \mathbb{N}}$ lying in $M_1$ and $M_2$ respectively, both converging to $\theta$, such that the line $(\theta^1_k, \theta^2_k)$ (defined in some local coordinate system around $\theta$) converges to some line $l$ and $T_{\theta^2_k} M_2$ converges to some flat, then this flat contains $l$.
Stratified maps are those that behave nicely with respect to stratifications. Here we only use a subset of the axioms they satisfy, hence we talk about weakly stratified maps.
Definition 2.22. Let $\mathcal{M}, \mathcal{N}$ be stratified manifolds. A map $f : \mathcal{M} \to \mathcal{N}$ is weakly stratified if the pre-image $f^{-1}(N')$ of any stratum $N' \in \mathcal{S}_{\mathcal{N}}$ is a union of strata in $\mathcal{S}_{\mathcal{M}}$.

Differentiability of barcode valued maps
Throughout this section, $\mathcal{M}$ denotes a smooth finite-dimensional manifold without boundary, which may or may not be compact. Our approach to characterizing the smoothness of a barcode-valued map is to factor it through the bundle of ordered barcodes:
Definition 3.1. For each choice of non-negative integers $m, n$, the space of ordered barcodes with $m$ finite bars and $n$ infinite ones is $\mathbb{R}^{2m} \times \mathbb{R}^n$, equipped with the Euclidean norm and the resulting smooth structure. The corresponding quotient map $Q_{m,n} : \mathbb{R}^{2m} \times \mathbb{R}^n \to \mathrm{Bar}$ quotients the space by the action of the product of symmetric groups $\mathfrak{S}_m \times \mathfrak{S}_n$, that is: for any ordered barcode $\tilde{D} = (b_1, d_1, \dots, b_m, d_m, v_1, \dots, v_n) \in \mathbb{R}^{2m} \times \mathbb{R}^n$,
$$Q_{m,n}(\tilde{D}) := \{(b_i, d_i)\}_{i=1}^{m} \cup \{(v_j, +\infty)\}_{j=1}^{n} \cup \Delta^\infty.$$
One can think of an ordered barcode $\tilde{D} \in \mathbb{R}^{2m} \times \mathbb{R}^n$ as a vector describing a persistence diagram with at most $m$ bounded off-diagonal points and exactly $n$ unbounded points. The former have their coordinates encoded in the adjacent pairs of the first $2m$ components of $\tilde{D}$, while the latter have the abscissa of their left endpoint encoded in the last $n$ components of $\tilde{D}$. The quotient map $Q_{m,n}$ forgets about the ordering of the bars in the barcodes. So far $Q_{m,n}$ is merely a map between sets, and it is natural to ask whether it is regular in some reasonable sense:
Proposition 3.2. For any $(m, n) \in \mathbb{N}^2$, $Q_{m,n}$ is 1-Lipschitz when $\mathrm{Bar}$ is equipped with the bottleneck distance.
Proof. For any two elements $\tilde{D}_1, \tilde{D}_2 \in \mathbb{R}^{2m} \times \mathbb{R}^n$, there is an obvious matching $\gamma$ on their images $Q_{m,n}(\tilde{D}_1), Q_{m,n}(\tilde{D}_2)$ given by matching the components of the vectors $\tilde{D}_1$ and $\tilde{D}_2$ entry-wise. The cost of this matching is then bounded above by the supremum norm of $\tilde{D}_1 - \tilde{D}_2$, by the definition of the matching cost $c(\gamma)$. In turn, the supremum norm is bounded above by the $\ell^2$ norm.
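The argument in this proof can be followed on a small numerical example. The Python sketch below is a minimal illustration under assumptions made for this purpose: an ordered barcode is a flat list `(b_1, d_1, ..., b_m, d_m, v_1, ..., v_n)`, the multiset returned by the quotient map is represented by sorted lists (one of many possible encodings), two infinite bars are compared through their left endpoints only, and the function names are hypothetical.

```python
import math


def quotient(ordered, m, n):
    """Q_{m,n}: forget the order of the bars (multisets as sorted lists)."""
    finite = sorted((ordered[2 * i], ordered[2 * i + 1]) for i in range(m))
    infinite = sorted(ordered[2 * m + j] for j in range(n))
    return finite, infinite


def entrywise_matching_cost(u, v, m, n):
    """Cost of the matching pairing the i-th bar of u with the i-th bar of v."""
    finite_cost = max(
        (max(abs(u[2 * i] - v[2 * i]), abs(u[2 * i + 1] - v[2 * i + 1]))
         for i in range(m)),
        default=0.0,
    )
    infinite_cost = max(
        (abs(u[2 * m + j] - v[2 * m + j]) for j in range(n)),
        default=0.0,
    )
    return max(finite_cost, infinite_cost)


def sup_norm(u, v):
    return max(abs(a - b) for a, b in zip(u, v))


def l2_norm(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Two ordered barcodes with m = 2 finite bars and n = 1 infinite bar.
u = [0.0, 1.0, 2.0, 5.0, 0.5]
v = [0.0, 1.5, 2.0, 4.0, 0.25]
# The same barcode as u with its two finite bars listed in the other order.
u_permuted = [2.0, 5.0, 0.0, 1.0, 0.5]
```

Since the bottleneck distance is an infimum over all matchings, it is at most `entrywise_matching_cost(u, v, 2, 1)`, which in turn is at most `sup_norm(u, v)` and `l2_norm(u, v)`.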
We then say that a barcode valued map is smooth if it admits a smooth lift into the space of ordered barcodes for some choice of m, n: Remark 3.4 (Locally finite number of off-diagonal points). If a function B as above is r-differentiable at x P M, then locally for any x 1 around x we can upper-bound the number of off-diagonal points arising in Bpx 1 q by m`n. Notice that off-diagonal points can possibly appear in Bpx 1 q and become part of the diagonal ∆ in Bpxq, which is to say that Defnition 3.3 does not restrict the function B to locally consist in a fixed number of off-diagonal points. Informally, in analogy with the fact that a barcode has finitely many off-diagonal points, our definition of smoothness allows finitely many appearances or disappearances of off-diagonal points in the neighborhood of a barcode. Remark 3.5 (0-differentiability is stronger than bottleneck continuity). If B : M Ñ Bar is 0-differentiable, then B is continuous when Bar is given the bottleneck topology. This comes from the Lipschitz continuity of Q m,n (Proposition 3.2) and the fact that continuity is stable under composition. The converse is false, because, on the one hand, if B is 0-differentiable then locally the number of off-diagonal points in the image of B is uniformly bounded (see the previous remark), while on the other hand, the number of off-diagonal points appearing in barcodes in any given open bottleneck ball is arbitrarily large.
Definition 3.6. Let $B : \mathcal{M} \to \mathrm{Bar}$ be 1-differentiable at some $x$, and $\tilde{B} : U \to \mathbb{R}^{2m} \times \mathbb{R}^n$ be a $C^1$ lift of $B$ defined on an open neighborhood $U$ of $x$. The differential (or derivative) $d_{x,\tilde{B}} B$ of $B$ at $x$ with respect to $\tilde{B}$ is defined to be the differential of $\tilde{B}$ at $x$:
$$d_{x,\tilde{B}} B := d_x \tilde{B} : T_x \mathcal{M} \to \mathbb{R}^{2m} \times \mathbb{R}^n.$$

Post-composing with the quotient map, we can see $Q_{m,n} \circ d_{x,\tilde{B}} B : T_x \mathcal{M} \to \mathrm{Bar}$ as a multi-set of co-vectors, one above each off-diagonal point of $B(x)$ (plus some distinguished diagonal points), describing linear changes in the coordinates of the points of $B(x)$ under infinitesimal perturbations of $x$. In this respect, the spaces of ordered barcodes $\mathbb{R}^{2m+n}$ play the role of tangent spaces over $\mathrm{Bar}$. For practical computations, it can be convenient to work with an alternate yet equivalent notion of differentiability, based on point trackings:

Definition 3.7. Let $B : \mathcal{M} \to \mathrm{Bar}$ be a barcode valued map. Let $x \in \mathcal{M}$ and $r \in \mathbb{N} \cup \{+\infty\}$. A $C^r$ local coordinate system for $B$ at $x$ is a collection of maps $\{b_i, d_i : U \to \mathbb{R}\}_{i \in I}$ and $\{v_j : U \to \mathbb{R}\}_{j \in J}$, for finite sets $I, J$, defined on an open neighborhood $U$ of $x$, such that:
(Smoothness) The maps $b_i, d_i$ and $v_j$ are of class $C^r$ on $U$;
(Tracking) For any $x' \in U$ we have the multi-set equality
$$B(x') = \{(b_i(x'), d_i(x'))\}_{i \in I} \cup \{(v_j(x'), +\infty)\}_{j \in J} \cup \Delta^{\infty}.$$

Thus, in a local coordinate system, we have maps $b_i, d_i$ (resp. $v_j$) that track the endpoints of bounded (resp. unbounded) intervals in the image barcode through $B$. We will often abbreviate the data of a local coordinate system of $B$ at $x$ by $T = (U, \{b_i, d_i\}_{i \in I}, \{v_j\}_{j \in J})$.
Our two notions of differentiability are indeed equivalent:

Proposition 3.8. Let $B : \mathcal{M} \to \mathrm{Bar}$ be a barcode valued map and $x \in \mathcal{M}$. Then $B$ is $r$-differentiable at $x$ if and only if it admits a $C^r$ local coordinate system at $x$. Specifically, post-composing a $C^r$ local lift $\tilde{B} : U \to \mathbb{R}^{2m} \times \mathbb{R}^n$ around $x$ with the quotient map $Q_{m,n}$ yields a $C^r$ local coordinate system, and conversely, fixing an order on the functions of a $C^r$ local coordinate system yields a $C^r$ local lift.
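Proof. ($\Rightarrow$) Let $\tilde{B} : U \to \mathbb{R}^{2m} \times \mathbb{R}^n$ be a $C^r$ local lift of $B$ around $x$. For every $x' \in U$, set $(b_1(x'), d_1(x'), \dots, b_m(x'), d_m(x'), v_1(x'), \dots, v_n(x')) := \tilde{B}(x')$ to get a local coordinate system, which is $C^r$ over $U$ as $\tilde{B}$ is.
($\Leftarrow$) Let $T = (U, \{b_i, d_i\}_{i \in I}, \{v_j\}_{j \in J})$ be a $C^r$ local coordinate system for $B$ at $x$. Set $m = |I|$ and $n = |J|$, and fix two arbitrary bijections $s : \{1, \dots, m\} \to I$ and $t : \{1, \dots, n\} \to J$. Then the map
$$\tilde{B} : x' \in U \mapsto \big(b_{s(1)}(x'), d_{s(1)}(x'), \dots, b_{s(m)}(x'), d_{s(m)}(x'), v_{t(1)}(x'), \dots, v_{t(n)}(x')\big)$$
satisfies $Q_{m,n} \circ \tilde{B} = B$ on $U$ by the tracking property. As a map valued in a Euclidean space, $\tilde{B}$ is $C^r$ because all its coordinate functions are.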
Remark 3.9 (Non-uniqueness of differentials). It is important to keep in mind that the differential of $B$ at $x$ is not uniquely defined, as it depends on the choice of local lift. Indeed, for two distinct lifts $\tilde{B}, \tilde{B}'$ of $B$ at $x$, we usually get distinct differentials $d_{x,\tilde{B}} B$ and $d_{x,\tilde{B}'} B$. For instance, if $\tilde{B}'$ is obtained from $\tilde{B}$ by appending an extra pair of coordinates of the form $(f, f)$, where $f$ is a smooth real function, then $d_{x,\tilde{B}'} B$ takes its values in a different codomain than $d_{x,\tilde{B}} B$. Note that this will not be an issue in the rest of the paper, as any choice of differential will yield a valid chain rule (Section 3.3).

Differentiability of maps defined on barcodes
Let $\mathcal{N}$ be a smooth finite-dimensional manifold without boundary. Our notion of differentiability for maps $V : \mathrm{Bar} \to \mathcal{N}$ is in some sense dual to the one for maps $B : \mathcal{M} \to \mathrm{Bar}$, as will be justified formally in the next section.

Definition 3.10. Let $V : \mathrm{Bar} \to \mathcal{N}$ be a map on barcodes. Let $D \in \mathrm{Bar}$ and $r \in \mathbb{N} \cup \{+\infty\}$. $V$ is said to be $r$-differentiable at $D$ if, for all integers $m, n$ and all vectors $\tilde{D} \in \mathbb{R}^{2m} \times \mathbb{R}^n$ such that $Q_{m,n}(\tilde{D}) = D$, the map $V \circ Q_{m,n} : \mathbb{R}^{2m} \times \mathbb{R}^n \to \mathcal{N}$ is $C^r$ on an open neighborhood of $\tilde{D}$.
Notice that for each choice of $m, n$ we have a unique map $V \circ Q_{m,n}$, and we must check its differentiability at all the (possibly many) distinct pre-images $\tilde{D}$ of $D$ and for all $m, n$. One can think of a choice of $m, n$ and pre-image $\tilde{D}$ of $D$ as a choice of tangent space of $\mathrm{Bar}$ at $D$.

Example 3.11 (Total persistence function). Let $V : \mathrm{Bar} \to \mathbb{R}$ be defined as the sum, over bounded intervals $(b, d)$ in a barcode $D$, of the length $(d - b)$. Given $D \in \mathrm{Bar}$ and an ordered barcode $\tilde{D} = (b_1, d_1, \dots, b_m, d_m, v_1, \dots, v_n) \in \mathbb{R}^{2m+n}$ such that $Q_{m,n}(\tilde{D}) = D$, the map $V \circ Q_{m,n}$ is a linear form and in particular is of class $C^\infty$ at $\tilde{D}$. Explicitly, we have
$$V \circ Q_{m,n}(\tilde{D}) = \sum_{i=1}^{m} (d_i - b_i).$$

The relationship between 0-differentiability and bottleneck continuity for maps $V$ is the opposite of the one that holds for maps $B$ (recall Remark 3.5):

Remark 3.12 (Bottleneck continuity is stronger than 0-differentiability). If $V : \mathrm{Bar} \to \mathcal{N}$ is continuous when $\mathrm{Bar}$ is equipped with the bottleneck topology, then $V$ is 0-differentiable. This is because the quotient map $Q_{m,n}$ is continuous (Proposition 3.2) and the composition of continuous maps is continuous. The converse is false, as seen for instance when taking $V$ to be the total persistence function: although 0-differentiable (because $\infty$-differentiable) on $\mathrm{Bar}$, $V$ is not continuous in the bottleneck topology, as it is unbounded in any open bottleneck ball.
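To make Example 3.11 concrete, here is a minimal sketch of the total persistence function evaluated on ordered barcodes, together with the gradient of the linear form $V \circ Q_{m,n}$. The function names are hypothetical and the flat-list layout $(b_1, d_1, \dots, b_m, d_m, v_1, \dots, v_n)$ follows Definition 3.1.

```python
def total_persistence(ordered, m, n):
    """V o Q_{m,n}: sum of (d_i - b_i) over the m finite bars of an ordered barcode."""
    assert len(ordered) == 2 * m + n
    return sum(ordered[2 * i + 1] - ordered[2 * i] for i in range(m))

def total_persistence_gradient(m, n):
    """Gradient of the linear form: -1 on each b_i, +1 on each d_i, 0 on each v_j."""
    return [-1.0, 1.0] * m + [0.0] * n

value = total_persistence([0.0, 1.0, 2.0, 5.0, 3.0], m=2, n=1)
# Re-ordering the bars, or appending a diagonal pair (7, 7), gives another
# pre-image of the same barcode and leaves the value unchanged.
value_permuted = total_persistence([2.0, 5.0, 0.0, 1.0, 3.0], m=2, n=1)
value_padded = total_persistence([0.0, 1.0, 2.0, 5.0, 7.0, 7.0, 3.0], m=3, n=1)
```

The value does not depend on the chosen pre-image, while the gradient lives in a space whose dimension does, which is consistent with viewing each pre-image as a choice of tangent space.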

Chain rule
We now combine the previous definitions to produce a chain rule.

Proposition 3.14. Let $B : \mathcal{M} \to \mathrm{Bar}$ be $r$-differentiable at $x \in \mathcal{M}$, and $V : \mathrm{Bar} \to \mathcal{N}$ be $r$-differentiable at $B(x)$. Then:
(i) $V \circ B : \mathcal{M} \to \mathcal{N}$ is $C^r$ at $x$ as a map between smooth manifolds;
(ii) If $r \geq 1$, then for any local $C^1$ lift $\tilde{B} : U \to \mathbb{R}^{2m+n}$ of $B$ around $x$ we have:
$$d_x (V \circ B) = d_{\tilde{B}(x)} (V \circ Q_{m,n}) \circ d_x \tilde{B}.$$

The meaning of this formula is that, even though the differentials of $B$ and of $V$ may depend on the choice of lift $\tilde{B} : U \to \mathbb{R}^{2m+n}$, their composition does not, and in fact it matches the usual differential of $V \circ B$ as a map between smooth manifolds.
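Proof. Let $\tilde{B} : U \to \mathbb{R}^{2m+n}$ be a $C^r$ local lift of $B$ around $x$. Since $V$ is $r$-differentiable at $B(x) = Q_{m,n}(\tilde{B}(x))$, the map $V \circ Q_{m,n}$ is $C^r$ on an open neighborhood of $\tilde{B}(x)$. This implies that the composition $V \circ B_{|U} = (V \circ Q_{m,n}) \circ \tilde{B}$ is $C^r$ at $x$, and therefore that $V \circ B$ itself is $C^r$ at $x$ since $U$ is open. This proves (i). The formula of (ii) then follows from applying the usual chain rule to $V \circ Q_{m,n}$ and $\tilde{B}$, which are $C^1$ maps between smooth manifolds without boundary.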
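The independence of the composed differential from the choice of lift can be illustrated on a toy case. The sketch below is an assumption-laden example, not taken from the paper: it takes $\mathcal{M} = \mathcal{N} = \mathbb{R}$, the barcode valued map $B(x) = \{(x, 2x + 1)\}$ near $x = 0.3$, and $V$ the total persistence function. Two lifts are compared, the second one padded with a diagonal pair $(\sin x, \sin x)$ as in Remark 3.9.

```python
import math

def grad_total_persistence(m, n):
    """Gradient of V o Q_{m,n} for the total persistence V (a linear form)."""
    return [-1.0, 1.0] * m + [0.0] * n

def chain_rule(grad_v, jac_lift):
    """d(V o B) = d(V o Q_{m,n}) applied to d(lift); a dot product when M = N = R."""
    return sum(g * j for g, j in zip(grad_v, jac_lift))

def jac_lift_a(x):
    # Derivative of the lift x -> (x, 2x + 1), with m = 1 and n = 0.
    return [1.0, 2.0]

def jac_lift_b(x):
    # Derivative of the lift x -> (x, 2x + 1, sin x, sin x), with m = 2 and n = 0.
    return [1.0, 2.0, math.cos(x), math.cos(x)]

x = 0.3
d_a = chain_rule(grad_total_persistence(1, 0), jac_lift_a(x))
d_b = chain_rule(grad_total_persistence(2, 0), jac_lift_b(x))
```

Both values agree with the ordinary derivative of $x \mapsto (2x + 1) - x$, namely 1, even though the two lifts have differentials with different codomains.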
Example 3.15. In [HGR+20], given a $C^\infty$ neural network architecture $F_0 : \mathbb{R}^N \to \mathbb{R}^{K_0}$ valued in the set of functions over the vertices of a fixed graph $K$, the optimization pipeline requires taking the gradient of the following loss function:
$$\mathcal{L} : \theta \in \mathbb{R}^N \mapsto \sum_{(b,d) \in \mathrm{Dgm}_p(F_0(\theta))} s(b, d),$$
where $s : \mathbb{R}^2 \to \mathbb{R}$ is a fixed smooth map, and $\mathrm{Dgm}_p(F_0(\theta))$ is the degree-$p$ persistence diagram associated to the lower star filtration induced by $F_0(\theta)$ on $K$ (see Section 5.1, dedicated to the full analysis of lower star filtrations). We may see $\mathcal{L}$ as the composition $\mathcal{L} = V \circ B$, where $B := \mathrm{Dgm}_p \circ F_0 : \mathbb{R}^N \to \mathrm{Bar}$ and $V : D \in \mathrm{Bar} \mapsto \sum_{(b,d) \in D} s(b, d)$. On the one hand, $B$ is $\infty$-differentiable at every $\theta$ where $F_0(\theta)$ is injective, as will be detailed in Section 5.1. On the other hand, $V$ is $\infty$-differentiable everywhere on $\mathrm{Bar}$, a fact obtained exactly as in the case of the total persistence function of Example 3.11. By the chain rule (Proposition 3.14), we deduce that the loss $\mathcal{L}$ is smooth at every $\theta$ where $F_0(\theta)$ is injective. Thus we recover the differentiability result of [HGR+20]. In fact, the upcoming Theorem 4.9 ensures that $B$ is $\infty$-differentiable over an open dense subset of $\mathbb{R}^N$, and therefore so is $\mathcal{L}$ by the chain rule.

Higher-order derivatives
The notions of derivatives introduced in Definitions 3.6 and 3.13 extend naturally to higher orders. For simplicity, we place ourselves in the Euclidean setting, letting $\mathcal{M} = \mathbb{R}^N$ and $\mathcal{N} = \mathbb{R}^{N'}$ for some $N, N' \in \mathbb{N}$.

Definition 3.16. Let $B : \mathbb{R}^N \to \mathrm{Bar}$ be $r$-differentiable at some $x$, and $\tilde{B} : U \to \mathbb{R}^{2m} \times \mathbb{R}^n$ be a $C^r$ lift of $B$ defined on an open neighborhood $U$ of $x$. The $r$-th differential (or derivative) of $B$ at $x$ with respect to $\tilde{B}$ is defined to be the $r$-th Fréchet differential of $\tilde{B}$ at $x$:
$$d^r_x \tilde{B} : (\mathbb{R}^N)^r \longrightarrow \mathbb{R}^{2m} \times \mathbb{R}^n.$$

Dually:
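Definition 3.17. Let $V : \mathrm{Bar} \to \mathbb{R}^{N'}$ be $r$-differentiable at $D \in \mathrm{Bar}$, and $\tilde{D} \in \mathbb{R}^{2m+n}$ be a pre-image of $D$ via $Q_{m,n}$. The $r$-th differential (or derivative) of $V$ at $D$ with respect to $\tilde{D}$ is the $r$-th Fréchet differential of $V \circ Q_{m,n}$ at $\tilde{D}$:
$$d^r_{\tilde{D}} (V \circ Q_{m,n}) : (\mathbb{R}^{2m+n})^r \longrightarrow \mathbb{R}^{N'}.$$

Note that, given maps $B : \mathbb{R}^N \to \mathrm{Bar}$ and $V : \mathrm{Bar} \to \mathbb{R}^{N'}$ that are $r$-differentiable at $x$ and $B(x)$ respectively, the chain rule of Section 3.3 adapts readily to higher-order derivatives of $V \circ B$ at $x$.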
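Meanwhile, we get a natural Taylor expansion of $B$ at $x$ with respect to $\tilde{B}$:

Proposition 3.18. Let $B : \mathbb{R}^N \to \mathrm{Bar}$ be $r$-differentiable at $x$ for some finite $r$, and let $\tilde{B} : U \to \mathbb{R}^{2m} \times \mathbb{R}^n$ be a $C^r$ lift of $B$ around $x$. Then, as $h \to 0$,
$$d_\infty\Big(B(x + h),\; Q_{m,n}\Big(\sum_{k=0}^{r} \frac{1}{k!}\, d^k_x \tilde{B}(h, \dots, h)\Big)\Big) = o(\|h\|_2^r).$$

Proof. This follows from applying the standard Taylor-Young theorem to $\tilde{B}$, then post-composing with $Q_{m,n}$, which is 1-Lipschitz by Proposition 3.2.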
To our knowledge, there is in general no equivalent of this result for the map $V$, due to the lack of a Lipschitz-continuous section of $Q_{m,n}$.

The space of barcodes as a diffeological space
In this subsection, we detail how Bar, when viewed as the quotient of a disjoint union of Euclidean spaces, is canonically made into a diffeological space, as defined in Section 2.4. We then show that the resulting notions of diffeological smooth maps from and to Bar coincide with the definitions 3.3 and 3.10 of differentiability we chose for maps from and to Bar in the previous sections, thus making these two definitions dual to each other.
As a set, $\mathrm{Bar}$ is isomorphic to $\big(\bigsqcup_{m,n \in \mathbb{N}} \mathbb{R}^{2m+n}\big) / \sim$, where $\sim$ is the transitive closure of the following relations for $m, n$ ranging over $\mathbb{N}$:
• For any permutations $\pi, \tau$ of $\{1, \dots, m\}$ and $\{1, \dots, n\}$ respectively, $[(b_i, d_i)_{i=1}^m, (v_j)_{j=1}^n] \sim [(b_{\pi(i)}, d_{\pi(i)})_{i=1}^m, (v_{\tau(j)})_{j=1}^n]$, which indicates that persistence diagrams are multisets (i.e. intervals are not ordered);
• Any element $[(b_i, d_i)_{i=1}^m, (v_j)_{j=1}^n] \in \mathbb{R}^{2m+n}$ such that one of the first $m$ adjacent pairs $(b_i, d_i)$ satisfies $b_i = d_i$ is equivalent to the element of $\mathbb{R}^{2(m-1)+n}$ obtained by removing $(b_i, d_i)$. These identifications correspond to quotienting multisets by the diagonal $\Delta$.
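Since the Euclidean spaces $\mathbb{R}^{2m+n}$ are equipped with their Euclidean diffeologies, we obtain a canonical diffeology $\mathcal{D}(\mathrm{Bar})$ over $\mathrm{Bar}$ from Definitions 2.19 and 2.20. The plots of $\mathcal{D}(\mathrm{Bar})$ can be concretely characterized as follows:

Proposition 3.19. A map $B : U \to \mathrm{Bar}$ defined on a domain $U$ is a plot of $\mathcal{D}(\mathrm{Bar})$ if and only if, for every $x \in U$, there exist an open neighborhood $V \subseteq U$ of $x$, integers $m, n$ and a map $\tilde{B} : V \to \mathbb{R}^{2m+n}$ of class $C^\infty$ such that $B_{|V} = Q_{m,n} \circ \tilde{B}$.

In other words, a plot in $\mathcal{D}(\mathrm{Bar})$ is an $\infty$-differentiable map from a domain $U$ to $\mathrm{Bar}$.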
Proof. Note that the characterization of the quotient diffeology, as given in Definition 2.20, is in fact the characterization of the so-called push-forward diffeology induced by the quotient map-see [ also admits a lift to $\bigsqcup_{m,n \in \mathbb{N}} \mathbb{R}^{2m+n}$. Indeed, calling $D$ the unique barcode in the image of $B_{|W}$, we can choose one pre-image $\tilde{D}$ of $D$ in one of the spaces of ordered barcodes $\mathbb{R}^{2m+n}$, then take $\tilde{B}$ to be the constant map $W \to \{\tilde{D}\}$. that matches with $B_{|W}$ once post-composed with the quotient map modulo $\sim$. In turn, by the characterization of the sum diffeology in [IZ13, § 1.39], $\tilde{B}$ is a plot of $\bigsqcup_{m,n \in \mathbb{N}} \mathbb{R}^{2m+n}$ if and only if, for any $x \in W$, there is an open neighborhood $V \subseteq W$ of $x$ and a pair of indices $(m, n)$ such that the restriction $\tilde{B}_{|V}$ maps into $\mathbb{R}^{2m+n}$ and is in fact a plot of $\mathbb{R}^{2m+n}$. Equivalently, we have $B_{|V} = Q_{m,n} \circ \tilde{B}_{|V}$, where $\tilde{B}_{|V}$ is of class $C^\infty$ (since the spaces of ordered barcodes are equipped with their canonical Euclidean diffeologies).
Corollary 3.20. The smooth maps in Diffeo from a smooth manifold $\mathcal{M}$ without boundary (equipped with the diffeology from Theorem 2.18) to the diffeological space $\mathrm{Bar}$ are exactly the $\infty$-differentiable maps from $\mathcal{M}$ to $\mathrm{Bar}$.
Proof. Let $B : \mathcal{M} \to \mathrm{Bar}$ be a smooth map in Diffeo. For any plot $\phi : U \to \mathcal{M}$, the composition $B \circ \phi$ is a plot in $\mathcal{D}(\mathrm{Bar})$, therefore it locally rewrites as $Q_{m,n} \circ \tilde{B}$ for some $C^\infty$ lift $\tilde{B}$, by Proposition 3.19. Choosing $\phi$ to be a local coordinate chart, we then locally have $B = Q_{m,n} \circ \tilde{B} \circ \phi^{-1}$, which means that $B$ is $\infty$-differentiable. Conversely, if $B$ is $\infty$-differentiable, it locally rewrites as $B = Q_{m,n} \circ \tilde{B}$, hence for any plot $\phi : U \to \mathcal{M}$ the composition $B \circ \phi$ locally rewrites as $Q_{m,n} \circ \tilde{B} \circ \phi$ and therefore is a plot in $\mathcal{D}(\mathrm{Bar})$ by Proposition 3.19.
Dually:

Corollary 3.21. The smooth maps in Diffeo from the diffeological space $\mathrm{Bar}$ to a smooth manifold $\mathcal{N}$ without boundary (equipped with the diffeology from Theorem 2.18) are exactly the $\infty$-differentiable maps from $\mathrm{Bar}$ to $\mathcal{N}$.
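Proof. Let $V : \mathrm{Bar} \to \mathcal{N}$ be a smooth map in Diffeo. For any integers $m, n$, the quotient map $Q_{m,n} : \mathbb{R}^{2m+n} \to \mathrm{Bar}$ is a plot in $\mathcal{D}(\mathrm{Bar})$ by Proposition 3.19, therefore $V \circ Q_{m,n}$ is a plot of $\mathcal{N}$, i.e. a map of class $C^\infty$, which means that $V$ is $\infty$-differentiable. Conversely, suppose $V$ is $\infty$-differentiable. If $B : U \to \mathrm{Bar}$ is a plot, then it locally rewrites as $Q_{m,n} \circ \tilde{B}$ for some $C^\infty$ lift $\tilde{B}$, therefore $V \circ B$ is locally of the form $(V \circ Q_{m,n}) \circ \tilde{B}$, which is of class $C^\infty$ as a map between manifolds by the chain rule. Thus, $V \circ B$ is a plot, and therefore $V$ is smooth in Diffeo.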
Conceptually, we have made $\mathrm{Bar}$ into a diffeological space by viewing it as the quotient of the direct limit of the spaces of ordered barcodes. Then, $\infty$-differentiable maps are simply morphisms in Diffeo from or to smooth manifolds, rather than maps satisfying the a priori unrelated Definitions 3.3 and 3.10. More generally, by seeing $\mathrm{Bar}$ as one object in Diffeo where morphisms can come in or out, we have notions of smooth maps from or to $\mathrm{Bar}$ with respect to any other diffeological space. For instance, a map $f : \mathrm{Bar} \to \mathrm{Bar}$ is smooth if and only if all the maps $f \circ Q_{m,n}$, for varying integers $m, n$, are $\infty$-differentiable (the proof is left as an exercise to the reader). Note however that diffeology does not characterize the $r$-differentiable maps for finite $r$, nor the maps that are differentiable only locally, two concepts that are prominent in our analysis.

The case of barcode valued maps derived from real functions on a simplicial complex
In this section we consider barcode valued maps $B_p : \mathcal{M} \to \mathrm{Bar}$ that factor through the space $\mathbb{R}^K$ of real functions on a fixed finite abstract simplicial complex $K$:
$$B_p = \mathrm{Dgm}_p \circ F, \quad \text{where } F : \mathcal{M} \to \mathbb{R}^K.$$
In other words, we consider barcodes derived from real functions on $K$. Note that $\mathrm{Dgm}_p$, the barcode map in degree $p$, is only defined on the subspace of filter functions, i.e. functions $K \to \mathbb{R}$ that are monotone with respect to inclusions of faces in $K$. This subspace is a convex polytope bounded by the hyperplanes of equations $f(\sigma) = f(\sigma')$ for $\sigma \subsetneq \sigma' \in K$.
From now on, we consistently assume that $F$ takes its values in this polytope.

Example 4.1 (Height filters). Given an embedded simplicial complex $K \subseteq \mathbb{R}^d$, let $\mathcal{M} = S^{d-1}$ and $F : \theta \mapsto (\sigma \in K \mapsto \max_{x \in \sigma} \langle \theta, x \rangle)$. The filter functions considered here are the height functions on $K$, parametrized on the unit sphere $S^{d-1}$ by the map $F$.
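As an illustration of this parametrization, the following sketch evaluates a height filter on a small embedded complex. It relies on the fact that a linear function on a geometric simplex attains its maximum at a vertex. The names `height_filter`, `coords` and `simplices`, and the encoding of simplices as tuples of vertex labels, are assumptions made for the example.

```python
def height_filter(theta, coords, simplices):
    """F(theta)(sigma) = max over the vertices v of sigma of <theta, v>.

    theta: direction as a tuple; coords: dict vertex -> coordinates;
    simplices: iterable of tuples of vertices.
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))
    return {s: max(dot(theta, coords[v]) for v in s) for s in simplices}

# A segment in the plane: vertices 0 and 1, and the edge joining them.
coords = {0: (0.0, 0.0), 1: (1.0, 0.0)}
simplices = [(0,), (1,), (0, 1)]
f_east = height_filter((1.0, 0.0), coords, simplices)
f_west = height_filter((-1.0, 0.0), coords, simplices)
```

By construction, each edge receives the larger of the values of its two endpoints, so the resulting function is monotone with respect to face inclusion.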
By analogy with the previous example, we generally call $F$ the parametrization associated to $B$, although it may not always be a topological embedding of $\mathcal{M}$ into $\mathbb{R}^K$ (it may not even be injective). We also call $\mathcal{M}$ the parameter space, and use the generic notation $\theta$ to refer to an element in $\mathcal{M}$.
As we shall see in Section 4.1, a local coordinate system for the map $B_p$ at $\theta \in \mathcal{M}$ can be derived when the order of the values of the filter function $F(\theta)$ remains constant locally around $\theta$. For this purpose we introduce the following equivalence relation on filter functions $K \to \mathbb{R}$:

Definition 4.2. Given a filter function $f : K \to \mathbb{R}$, the increasing order of its values induces a pre-order on the simplices of $K$. Two filter functions $f, g$ are said to be ordering equivalent, written $f \sim g$, if they induce the same pre-order on $K$. This relation is an equivalence relation on filter functions, and we denote by $[f]$ the equivalence class of $f$. The (finite) set of equivalence classes is denoted by $\Omega(\mathbb{R}^K)$.
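Ordering equivalence can be tested directly from this definition: two functions induce the same pre-order exactly when, for every pair of simplices, the differences of their values have the same sign (negative, zero or positive). A minimal sketch, with hypothetical names and filter functions stored as dictionaries from simplices to values:

```python
from itertools import combinations

def sign(t):
    """Return -1, 0 or 1 according to the sign of t."""
    return (t > 0) - (t < 0)

def ordering_equivalent(f, g):
    """True if f and g induce the same pre-order on their common set of simplices."""
    assert f.keys() == g.keys()
    return all(sign(f[s] - f[t]) == sign(g[s] - g[t])
               for s, t in combinations(f, 2))

f = {"a": 0.0, "b": 1.0, "ab": 1.0}
g = {"a": -3.0, "b": 2.5, "ab": 2.5}   # same pre-order as f: a < b = ab
h = {"a": 0.0, "b": 1.0, "ab": 2.0}    # the tie between b and ab is broken
```

Here `f` and `g` lie in the same class of $\Omega(\mathbb{R}^K)$, whereas `h` lies in a different one.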
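In order to compare barcodes across an entire equivalence class of functions, we introduce barcode templates as follows:

Definition 4.3. Given a filter function $f \in \mathbb{R}^K$ and a homology degree $0 \leq p \leq d$, a barcode template $(P_p, U_p)$ is composed of a multiset $P_p$ of pairs of simplices in $K$, together with a multiset $U_p$ of simplices in $K$, such that:
$$\mathrm{Dgm}_p(f) = \{(f(\sigma), f(\sigma'))\}_{(\sigma,\sigma') \in P_p} \cup \{(f(\sigma), +\infty)\}_{\sigma \in U_p} \cup \Delta^{\infty}. \quad (8)$$
Note that we do not require a priori that $\dim \sigma = p$ and $\dim \sigma' = p + 1$.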
Proposition 4.4. For any filter function $f \in \mathbb{R}^K$ and homology degree $0 \leq p \leq d$, there exists a barcode template $(P_p, U_p)$ of $f$.

Proof. Consider the interval decomposition $H_p(f) \cong \bigoplus_{J \in \mathcal{J}} I_J$ of the $p$-th persistent homology module of $f$. Note that every interval endpoint in the decomposition corresponds to the $f$-value of some simplex of $K$ (since the persistent homology module has internal isomorphisms in-between these values). For every bounded interval $J$ with endpoints $b, d \in \mathbb{R}$, choose an element $(\sigma_J, \sigma'_J)$ in $f^{-1}(b) \times f^{-1}(d) \subseteq K \times K$, then form the multiset $P_p := \{(\sigma_J, \sigma'_J) \mid J \in \mathcal{J} \text{ bounded}\}$. Meanwhile, for every unbounded interval $J$ with finite endpoint $v \in \mathbb{R}$, choose an element $\sigma_J$ in $f^{-1}(v)$, then form the multiset $U_p := \{\sigma_J \mid J \in \mathcal{J} \text{ unbounded}\}$.
Barcode templates get their name from the fact that they are an invariant of the ordering equivalence relation ": Proof. The operation that takes a persistence module to its shift by h is an endofunctor of Pers which commutes with direct sums. In particular it preserves isomorphisms.
Proof of Proposition 4.5. Let $f, f'$ be two ordering equivalent filter functions. Since $f \sim f'$, we have $f(\sigma) = f(\sigma') \Rightarrow f'(\sigma) = f'(\sigma')$ for any pair of simplices $\sigma, \sigma' \in K$. Therefore the map $h : f(\sigma) \in f(K) \mapsto f'(\sigma) \in f'(K)$ is well-defined. Furthermore, $h$ is an increasing function and we extend it monotonically and continuously over all of $\mathbb{R}$. Then, by the reparametrization Lemma 4.6, any barcode template of $f$ is also a barcode template of $f'$.

Generic smoothness of the barcode valued map
We now state our first significant results (one local and the other global) about the differentiability of the map $B_p$ in the context of this section. Equipping $\mathbb{R}^K$ with the usual Euclidean norm, we assume that the parametrization $F$ is of class $C^r$ as a map $\mathcal{M} \to \mathbb{R}^K$. Under this hypothesis, we show that $B_p$ is $r$-differentiable in the sense of Definition 3.3 on a generic (open and dense) subset of $\mathcal{M}$. The intuition behind these results is that, whenever the filter functions $F(\theta')$ are all ordering equivalent in a neighborhood of $\theta$, we can pick a barcode template that is consistent across all filter functions $F(\theta')$ in this neighborhood (by Propositions 4.4 and 4.5), and Equation (8) then behaves like a local coordinate system for $B$ at $\theta$.
Here is our local result:

Theorem 4.7 (Local discrete smoothness). Let $\theta \in \mathcal{M}$. Suppose the parametrization $F : \mathcal{M} \to \mathbb{R}^K$ is of class $C^r$ ($r \geq 0$) on some open neighborhood $U$ of $\theta$, and that $F(\theta') \sim F(\theta)$ for all $\theta' \in U$. Then, $B_p$ is $r$-differentiable at $\theta$.
Proof. Note that, as an open set, $U$ is an open submanifold of $\mathcal{M}$ of the same dimension. By Proposition 4.4, we can pick a barcode template $(P_p, U_p)$ for $F(\theta)$. By Proposition 4.5, this barcode template is consistent for all $F(\theta')$ where $\theta' \in U$. Therefore, we can locally write:
$$B_p(\theta') = \{(F(\theta')(\sigma), F(\theta')(\sigma'))\}_{(\sigma,\sigma') \in P_p} \cup \{(F(\theta')(\sigma), +\infty)\}_{\sigma \in U_p} \cup \Delta^{\infty},$$
which is a local coordinate system for $B_p$ at $\theta$. This local coordinate system is $C^r$ because $F$ itself is $C^r$ over $U$. As a result, $B_p$ is $r$-differentiable at $\theta$, by Proposition 3.8.

Corollary 4.8. Let $\theta \in \mathcal{M}$. Suppose that the parametrization $F$ is of class $C^r$ ($r \geq 0$) on some open neighborhood of $\theta$, and that the filter function $F(\theta)$ is injective. Then, $B_p$ is $r$-differentiable at $\theta$.
Proof. For such a $\theta$, all the quantities $F(\theta)(\sigma) - F(\theta)(\sigma')$ for $\sigma \neq \sigma' \in K$ are either strictly positive or strictly negative. Therefore, by continuity they keep their sign in an open neighborhood of $\theta$, over which all filter functions are thus ordering equivalent. The result then follows from Theorem 4.7.
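Here is our global result:

Theorem 4.9 (Global discrete smoothness). Suppose the parametrization $F : \mathcal{M} \to \mathbb{R}^K$ is continuous over $\mathcal{M}$ and of class $C^r$ on some open subset $U$ of $\mathcal{M}$. Then, $B_p$ is $r$-differentiable on the set $U \cap \tilde{\mathcal{M}}$, where
$$\tilde{\mathcal{M}} := \{\theta \in \mathcal{M} \mid F(\theta') \sim F(\theta) \text{ for all } \theta' \text{ in some open neighborhood of } \theta\}, \quad (9)$$
which is generic (i.e. open and dense) in $\mathcal{M}$. In particular, if $F$ is $C^r$ on some generic subset of $\mathcal{M}$ in the first place, then so is $B_p$ (on some possibly smaller generic subset).

Proof. Observe that $\tilde{\mathcal{M}}$ is open in $\mathcal{M}$. As a consequence, for every $\theta \in U \cap \tilde{\mathcal{M}}$ there is some open neighborhood on which $F$ is $C^r$ and all the filter functions $F(\theta')$ are ordering equivalent, which by Theorem 4.7 implies that $B_p$ is $r$-differentiable at $\theta$. Thus, all that remains to be shown is that $\tilde{\mathcal{M}}$ is dense in $\mathcal{M}$, which is the subject of Lemma 4.10 below.

Lemma 4.10. If a parametrization $F : \mathcal{M} \to \mathbb{R}^K$ is continuous, then the set $\tilde{\mathcal{M}}$ (as defined in Eq. (9)) is dense in $\mathcal{M}$.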
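Proof. Let $h : \mathcal{M} \to \mathbb{R}$ be a continuous function. Consider the boundary of the zero-level set $h^{-1}(0)$:
$$\partial h^{-1}(0) := \overline{h^{-1}(0)} \setminus \operatorname{int}\big(h^{-1}(0)\big).$$
Since $h$ is continuous, $h^{-1}(0)$ is closed in $\mathcal{M}$, therefore $\partial h^{-1}(0)$ is closed with empty interior, i.e. its complement $(\partial h^{-1}(0))^c$ in $\mathcal{M}$ is open and dense.

Consider now the case of the function $h_{\sigma,\sigma'} : \theta \in \mathcal{M} \mapsto F(\theta)(\sigma) - F(\theta)(\sigma') \in \mathbb{R}$ for some fixed simplices $\sigma \neq \sigma'$ of $K$. The map $h_{\sigma,\sigma'}$ is continuous by continuity of the parametrization $F$, therefore the previous paragraph implies that $(\partial h_{\sigma,\sigma'}^{-1}(0))^c$ is generic in $\mathcal{M}$. Hence, the finite intersection
$$\hat{\mathcal{M}} := \bigcap_{\sigma \neq \sigma' \in K} \big(\partial h_{\sigma,\sigma'}^{-1}(0)\big)^c$$
is also generic in $\mathcal{M}$. We now show that $\hat{\mathcal{M}}$ is a subspace of $\tilde{\mathcal{M}}$.

Let $\theta \in \hat{\mathcal{M}}$ and $\sigma \neq \sigma' \in K$. If $h_{\sigma,\sigma'}(\theta) \neq 0$, then by continuity there is an open neighborhood $V_{\sigma,\sigma'}$ of $\theta$ over which $h_{\sigma,\sigma'}$ keeps the same nonzero sign. And if $h_{\sigma,\sigma'}(\theta) = 0$, then, since $\theta \in \hat{\mathcal{M}}$, $\theta$ lies in the interior of the level set $h_{\sigma,\sigma'}^{-1}(0)$, and therefore there is also an open neighborhood $V_{\sigma,\sigma'}$ of $\theta$ over which $h_{\sigma,\sigma'} = 0$. Let $V$ be the finite intersection $\bigcap_{\sigma \neq \sigma' \in K} V_{\sigma,\sigma'}$, which is open and non-empty in $\mathcal{M}$. For every $\sigma \neq \sigma' \in K$, the sign of $F(\theta')(\sigma) - F(\theta')(\sigma')$ is constant over all $\theta' \in V$, where by sign we really distinguish between three possibilities: negative, positive, null. Therefore, the pre-order on the simplices of $K$ induced by $F(\theta')$ is constant over the $\theta' \in V$. In other words, all the $F(\theta')$ are ordering equivalent. Therefore, $\theta \in \tilde{\mathcal{M}}$. Since this is true for any $\theta \in \hat{\mathcal{M}}$, we conclude that $\hat{\mathcal{M}} \subseteq \tilde{\mathcal{M}}$, and so the latter is also dense in $\mathcal{M}$.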
Example 4.11 (Height functions again). Let us reconsider the scenario of Example 4.1. The parametrization $F$ of height filters is $C^0$ on the entire sphere $S^{d-1}$. Moreover, $F$ is smooth at every direction $\theta \in S^{d-1}$ that is not orthogonal to any difference $v - v'$ of vertices $v \neq v' \in K_0$ in $\mathbb{R}^d$. The set $U$ of such directions is generic in $S^{d-1}$, therefore $B_p$ is $\infty$-differentiable over the generic subset $U \cap \tilde{S}^{d-1}$ by Theorem 4.9, with $\tilde{S}^{d-1}$ defined as in Eq. (9). In fact, we have $U \cap \tilde{S}^{d-1} = U$ in this case. Indeed, for any direction $\theta \in U$, the values of the height function $h_\theta$ at the vertices of $K$ are pairwise distinct, and by continuity this remains true in a neighborhood of $\theta$. The pre-order on the simplices of $K$ induced by the height function is then constant over this neighborhood.
In Theorems 4.7 and 4.9, one cannot avoid the condition that filter functions are locally ordering equivalent. Indeed, in the next examples, we highlight that there is generally no hope for the barcode valued map $B_p$ to be differentiable everywhere, even if the parametrization $F$ is. This is because, essentially, the time of appearance of a simplex is a maximum of smooth functions, which can be non-smooth at a point where two functions achieve the maximum. The condition that the induced pre-order is locally constant around $\theta$ is only a sufficient condition though, because a maximum of two smooth functions can still be smooth at a point where the maximum is attained by the two functions. We provide a second example to illustrate this fact.

Example 4.12 (Singular parameter). Let us consider the following geometric simplicial complex $K$ on the real line: $K$ has vertices $K_0 = \{a, b\}$ with respective coordinates $\{0, 1\}$, and edges $K_1 = \{ab\}$. Consider the parametrization that filters the complex according to the squared Euclidean distance to a point, i.e. $F : \theta \in \mathbb{R} \mapsto (\sigma \in K \mapsto \max_{x \in \sigma} (x - \theta)^2)$. The map $B_0$ is then essentially a real function that tracks the squared Euclidean distance of the vertex closest to $\theta$, specifically:
$$B_0(\theta) = \{(\min(\theta^2, (1 - \theta)^2), +\infty)\} \cup \Delta^{\infty}.$$
Hence, $B_0$ is not differentiable at $\theta = \tfrac{1}{2}$, since $\tfrac{1}{2}$ is a singular point of the map $\theta \mapsto \min(\theta^2, (1 - \theta)^2)$. Meanwhile, for $\theta < \tfrac{1}{2}$, we have $F(\theta)(a) < F(\theta)(b)$, whereas whenever $\theta > \tfrac{1}{2}$, we have $F(\theta)(a) > F(\theta)(b)$. In particular, the pre-order induced by the filter functions $F(\theta)$ is not constant around $\theta = \tfrac{1}{2}$, and so $\tfrac{1}{2} \notin \tilde{\mathbb{R}}$.

Example 4.13 (Only sufficient condition). We remove the edge $ab$ from the geometric complex $K$ in the previous example, and we see the points $a$ and $b$ as lying on the x-axis of $\mathbb{R}^2$. Consider the parametrization of height filters $F : \theta \in S^1 \mapsto (\sigma \in K \mapsto \max_{x \in \sigma} \langle \theta, x \rangle)$. The map $B_p$ is then trivial for each degree $p$ except 0, where it writes as follows:
$$B_0(\theta) = \{(\langle \theta, a \rangle, +\infty), (\langle \theta, b \rangle, +\infty)\} \cup \Delta^{\infty} = \{(0, +\infty), (\langle \theta, (1, 0) \rangle, +\infty)\} \cup \Delta^{\infty}.$$
In particular, $B_0$ is $\infty$-differentiable on all of $S^1$, even though the pre-order induced by $F(\theta)$ changes at the two directions $\theta = (0, \pm 1)$, where $\langle \theta, a \rangle = \langle \theta, b \rangle$.
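The contrast between Examples 4.12 and 4.13 can be observed numerically with one-sided finite differences. This is a rough sketch under stated assumptions: the circle $S^1$ is parametrized by an angle, the step size `h` is arbitrary, and all function names are hypothetical.

```python
import math

def lift_412(theta):
    """Example 4.12: left endpoint of the infinite bar, min(theta^2, (1 - theta)^2)."""
    return min(theta ** 2, (1.0 - theta) ** 2)

def lift_413(angle):
    """Example 4.13: the non-constant left endpoint <theta, (1, 0)> for theta = (cos, sin)."""
    return math.cos(angle)

def one_sided_derivatives(f, t, h=1e-6):
    """Backward and forward difference quotients of f at t."""
    left = (f(t) - f(t - h)) / h
    right = (f(t + h) - f(t)) / h
    return left, right

# Example 4.12 at the singular parameter 1/2: the two slopes disagree.
left_412, right_412 = one_sided_derivatives(lift_412, 0.5)
# Example 4.13 at theta = (0, 1), where the pre-order changes: the slopes agree.
left_413, right_413 = one_sided_derivatives(lift_413, math.pi / 2)
```

In the first case the slopes are close to 1 and -1, whereas in the second case both are close to -1, which matches the claim that a locally constant pre-order is sufficient but not necessary for differentiability.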

Differential of the barcode valued map
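Given a continuous parametrization $F : \mathcal{M} \to \mathbb{R}^K$ of class $C^1$ on some open set $U \subseteq \mathcal{M}$, Theorem 4.9 guarantees that a barcode template, through Equation (8), provides a $C^1$ local coordinate system for $B_p$ around each point $\theta \in U \cap \tilde{\mathcal{M}}$. In turn, by Proposition 3.8, any arbitrary ordering on the functions of this local coordinate system induces a $C^1$ local lift of $B_p$. Hence we have the following formula for the corresponding differential:

Proposition 4.14. Given $\theta \in U \cap \tilde{\mathcal{M}}$ and a barcode template $(P_p, U_p)$ of $F(\theta)$, for any choice of ordering $(\sigma_1, \sigma'_1), \dots, (\sigma_m, \sigma'_m), \tau_1, \dots, \tau_n$ of $(P_p, U_p)$, the map
$$\tilde{B}_p : \theta' \mapsto \big(F(\theta')(\sigma_1), F(\theta')(\sigma'_1), \dots, F(\theta')(\sigma_m), F(\theta')(\sigma'_m), F(\theta')(\tau_1), \dots, F(\theta')(\tau_n)\big)$$
is a local $C^1$ lift of $B_p$ around $\theta$, and the corresponding differential for $B_p$ at $\theta$ is:
$$d_{\theta,\tilde{B}_p} B_p = \big(d_\theta F_{\sigma_1}, d_\theta F_{\sigma'_1}, \dots, d_\theta F_{\sigma_m}, d_\theta F_{\sigma'_m}, d_\theta F_{\tau_1}, \dots, d_\theta F_{\tau_n}\big),$$
where $F_\sigma$ denotes the coordinate function $\theta' \mapsto F(\theta')(\sigma)$.

Remark 4.15 (Algorithm for computing derivatives). Suppose we are given a parametrization $F$ whose differential we can compute. Let $\theta \in \mathcal{M}$. If the barcode of $F(\theta)$ is given to us, then the proof of Proposition 4.4 provides an algorithm to build a barcode template $(P_p, U_p)$ for $F(\theta)$. If the barcode of $F(\theta)$ is not given in the first place, then the matrix reduction algorithm for computing persistence [ELZ02, ZC05] outputs both the barcode and a barcode template. In both scenarios, Proposition 4.14 gives a formula to compute a differential of $B_p$ at $\theta$ from the barcode template $(P_p, U_p)$.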
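The procedure of Remark 4.15 can be sketched as follows. This is a simplified illustration, not the authors' implementation: it uses the textbook column reduction over $\mathbb{Z}/2$ on a tiny complex, encodes simplices as sorted tuples of vertices, and all names (`barcode_template`, `differential`, `d_filter`) are hypothetical. The complex and parametrization are those of Example 4.12, evaluated at $\theta = 0.2$.

```python
def barcode_template(simplices, f):
    """Pair simplices by column reduction over Z/2.

    simplices: list of vertex tuples, closed under taking faces;
    f: dict simplex -> value, assumed monotone under face inclusion.
    Returns (pairs, unpaired): (birth simplex, death simplex) pairs and
    the simplices that give birth to infinite bars.
    """
    # Sort by value, breaking ties by dimension so that faces come first.
    order = sorted(simplices, key=lambda s: (f[s], len(s)))
    index = {s: i for i, s in enumerate(order)}
    columns = []
    for s in order:
        faces = [s[:k] + s[k + 1:] for k in range(len(s))] if len(s) > 1 else []
        columns.append({index[t] for t in faces})
    low_to_col = {}  # lowest nonzero row of a reduced column -> its column index
    pairs = []
    for j, col in enumerate(columns):
        while col and max(col) in low_to_col:
            col ^= columns[low_to_col[max(col)]]  # add an earlier column mod 2
        if col:
            low_to_col[max(col)] = j
            pairs.append((order[max(col)], order[j]))
    paired = {s for pair in pairs for s in pair}
    unpaired = [s for s in order if s not in paired]
    return pairs, unpaired

def differential(pairs, unpaired, dF, p):
    """Differential of B_p in the lift given by the template (Proposition 4.14)."""
    finite = [(dF[s], dF[t]) for s, t in pairs if len(s) == p + 1]
    infinite = [dF[s] for s in unpaired if len(s) == p + 1]
    return finite, infinite

# Example 4.12: vertices a = 0 and b = 1 on the line, and the edge ab.
coords = {0: 0.0, 1: 1.0}
K = [(0,), (1,), (0, 1)]
theta = 0.2
f = {s: max((coords[v] - theta) ** 2 for v in s) for s in K}

def d_filter(s):
    # Derivative in theta of max_v (x_v - theta)^2, taken at the maximizing vertex.
    x = max((coords[v] for v in s), key=lambda c: (c - theta) ** 2)
    return 2.0 * (theta - x)

dF = {s: d_filter(s) for s in K}
pairs, unpaired = barcode_template(K, f)
finite, infinite = differential(pairs, unpaired, dF, p=0)
```

In this run the vertex $b$ is paired with the edge $ab$ (a bar of zero length, hence a diagonal point), and the vertex $a$ carries the infinite bar, whose derivative $2\theta$ agrees with that of $\theta \mapsto \theta^2$ for $\theta < \tfrac{1}{2}$.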

Directional differentiability of the barcode valued map along strata
In this section we define directional derivatives for the barcode valued map $B_p : \mathcal{M} \to \mathrm{Bar}$ at points where it may not be differentiable in the sense of Definition 3.3. For this we stratify the parameter space $\mathcal{M}$ in such a way that $B_p$ is differentiable on the top-dimensional strata, then we define its derivatives on lower-dimensional strata via directional lifts. Intuitively, the strata in $\mathcal{M}$ are prescribed by the ordering equivalence classes in $\mathbb{R}^K$, as we know from Theorem 4.7 that the pre-order on simplices plays a key role in the differentiability of $B_p$.
Formally, consider the stratification of $\mathbb{R}^K$ formed by the collection $\Omega(\mathbb{R}^K)$ of ordering equivalence classes. This is a Whitney stratification, obtained by cutting $\mathbb{R}^K$ with the hyperplanes $\{f(\sigma) = f(\sigma')\}$ for varying simplices $\sigma \neq \sigma' \in K$.
We look for stratifications of $\mathcal{M}$ that make the parametrization $F$ weakly stratified (in the sense of Definition 2.22) and smooth on each stratum. Here are typical scenarios where such stratifications exist:

Proposition 4.16. Let $F : \mathcal{M} \to \mathbb{R}^K$ be a continuous parametrization. Suppose that either (i) $\mathcal{M}$ is a semi-algebraic set in $\mathbb{R}^N$ and $F$ is a semi-algebraic map, or (ii) $\mathcal{M}$ is a compact subanalytic set in a real analytic manifold and $F$ is a subanalytic map.
Then, there is a Whitney stratification of $\mathcal{M}$, made of semi-algebraic (resp. subanalytic) strata, such that $F$ is weakly stratified with $C^\infty$ restrictions to each stratum.
Proof. This is Section I.1.7 of [GM88], after observing that the stratification $\Omega(\mathbb{R}^K)$ is made of semi-algebraic strata.
Example 4.17. We consider the parametrization $F$ of height filters on the sphere $S^{d-1}$ from Example 4.11. By Proposition 4.16, there is a stratification of $S^{d-1}$ that makes $F$ weakly stratified and $C^\infty$ on each stratum. To be more specific, such a stratification is obtained by taking the pre-images of the strata of $\Omega(\mathbb{R}^K)$ via $F$. Figure 1 illustrates the result in the case $d = 3$, where the obtained stratification of $S^2$ is made of an arrangement of great circles, each circle being the pre-image of a set $\{F(\theta)(v) = F(\theta)(v')\}$ for vertices $v \neq v'$.
Once a stratification $\mathcal{S}_{\mathcal{M}}$ of $\mathcal{M}$ is given, we can introduce a notion of derivative for $B_p$ at $\theta \in \mathcal{M}$ in the direction of an incident stratum $\mathcal{M}'$, i.e. a stratum whose closure in $\mathcal{M}$ contains $\theta$.

Proof. Let $\theta \in \mathcal{M}$ and $\mathcal{M}'$ a stratum incident to $\theta$. By (i), combined with Propositions 4.4 and 4.5, there exists a barcode template $(P_p, U_p)$ that is consistent across all $F(\theta')$ for $\theta' \in \mathcal{M}'$. Therefore, for all $\theta' \in \mathcal{M}'$:
$$B_p(\theta') = \{(F(\theta')(\sigma), F(\theta')(\sigma'))\}_{(\sigma,\sigma') \in P_p} \cup \{(F(\theta')(\sigma), +\infty)\}_{\sigma \in U_p} \cup \Delta^{\infty},$$
which by (ii) provides a $C^r$ local coordinate system for $B_{p|\mathcal{M}'}$. Then by Proposition 3.8, there is a $C^r$ lift of $B_{p|\mathcal{M}'}$, whose coordinate functions are of the form $\theta' \mapsto F(\theta')(\sigma)$. Using (iii), we extend each coordinate function of this lift (hence the lift itself) to an open neighborhood $U$ of $\theta$ in $\mathcal{M}$.
Combining Proposition 4.16 with Theorem 4.19 yields the following:

Corollary 4.20. Under the hypotheses of Proposition 4.16, there is a Whitney stratification of $\mathcal{M}$, made of semi-algebraic (resp. subanalytic) strata, such that $B_p$ is $\infty$-differentiable on the top-dimensional strata (whose union is generic in $\mathcal{M}$). If furthermore $F$ is globally $C^r$, then $B_p$ is everywhere $r$-differentiable along incident strata.
Example 4.21. Consider again the setup of Example 4.12. We stratify $\mathbb{R}$ by the point $\{\tfrac{1}{2}\}$ and the half-lines $(-\infty, \tfrac{1}{2})$ and $(\tfrac{1}{2}, +\infty)$. The parametrization $F$ is $C^\infty$ and sends strata into strata, therefore by Theorem 4.19 the barcode valued map $B_0$ admits directional derivatives everywhere on $\mathbb{R}$. More precisely, recall that we have a lift $\tilde{B}_0 : \theta \mapsto \min(\theta^2, (1 - \theta)^2)$, which is smooth in the top-dimensional strata, while at $\theta = \tfrac{1}{2}$ it admits directional derivatives along the two half-lines, whose values are $1$ and $-1$ respectively and thus do not agree.

Example 4.22. Consider again the stratification $\mathcal{S}_{S^{d-1}}$ by the great circles of the parameter space $S^{d-1}$ associated to the parametrization of height filters (Example 4.17). By Corollary 4.20, we know that there exists a refinement $\mathcal{S}'_{S^{d-1}}$ of $\mathcal{S}_{S^{d-1}}$ such that $B_p$ admits directional derivatives along incident strata of $\mathcal{S}'_{S^{d-1}}$ at every point $\theta \in S^{d-1}$. In fact, we can even take $\mathcal{S}'_{S^{d-1}}$ to be $\mathcal{S}_{S^{d-1}}$ itself. Indeed, all the directions in a given stratum $\mathcal{M}' \in \mathcal{S}_{S^{d-1}}$ induce the same pre-order on the simplices of $K$, therefore
• the restriction $F_{|\mathcal{M}'}$ is valued in a stratum of $\Omega(\mathbb{R}^K)$, and
• for every simplex $\sigma \in K$, there is a vertex $\bar{v}(\sigma)$ such that $F_{|\mathcal{M}'}(\cdot)(\sigma) = \langle \cdot, \bar{v}(\sigma) \rangle$.
Consequently, the assumptions of Theorem 4.19 hold, and the barcode valued map $B_p$ admits directional derivatives along incident strata of $\mathcal{S}_{S^{d-1}}$ at every point $\theta \in S^{d-1}$.

The barcode valued map as a permutation map
In this section, we work out a global lift of the barcode valued map, which restricts nicely to each stratum of a stratification of $\mathcal{M}$. To do so, we first focus on the map $\mathrm{Dgm}$ which, given a filter function $f \in \mathbb{R}^K$ on a fixed simplicial complex $K$ of dimension $d$, returns the vector of all its barcodes $(\mathrm{Dgm}_p(f))_{p=0}^{d}$. We observe that $\mathrm{Dgm}$ admits a global Euclidean lift, and furthermore, that this lift is essentially a permutation map on each stratum of $\Omega(\mathbb{R}^K)$. Throughout, we fix an ordering of the simplices of $K$, so that the canonical basis of $\mathbb{R}^K$ turns into a basis of $\mathbb{R}^{\#K}$, and we let $\phi : \mathbb{R}^K \to \mathbb{R}^{\#K}$ be the corresponding isomorphism.
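Proposition 4.23. There exist integers $m_p, n_p$ for $0 \leq p \leq d$ such that $\sum_{p=0}^{d} (2m_p + n_p) = \#K$, and a map $\mathrm{Perm} : \mathbb{R}^K \to \prod_{p=0}^{d} \mathbb{R}^{2m_p} \times \mathbb{R}^{n_p} \cong \mathbb{R}^{\#K}$ whose restriction $\mathrm{Perm}_{|S}$ to each ordering equivalence class $S \in \Omega(\mathbb{R}^K)$ is a permutation matrix, and such that the following diagram commutes:
$$\mathrm{Dgm} = \Big(\prod_{p=0}^{d} Q_{m_p,n_p}\Big) \circ \mathrm{Perm}. \quad (11)$$
For simplicity, from now on we identify $f \in \mathbb{R}^K$ with its image in $\mathbb{R}^{\#K}$ without explicitly mentioning the map $\phi$.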
Proof. Given a filter function $f \in \mathbb{R}^K$, we define a total barcode template $(P, U)$ for $f$ to be the data of $d + 1$ barcode templates $(P_p, U_p)$ for $f$, one in each homology degree, such that each simplex of $K$ appears exactly once, in a unique $P_p$ or $U_p$. We further require that the pairs $(\sigma, \sigma')$ appearing in $P_p$ consist of a $p$-dimensional simplex $\sigma$ and a $(p+1)$-dimensional simplex $\sigma'$, while the unpaired simplices appearing in $U_p$ must be $p$-dimensional. A simplex $\sigma$ is then labelled positive if it appears as the first component of a pair in some $P_p$ or in some $U_p$, and negative otherwise.
Note that total barcode templates always exist, by an argument similar to (yet somewhat more involved than) the one used in the proof of Proposition 4.4. Alternatively, note that applying the matrix reduction algorithm for computing persistence [ELZ02, ZC05] to the sublevel-sets filtration of $f$ produces a total barcode template. By Proposition 4.5, total barcode templates are invariant under ordering equivalences. We therefore fix a unique total barcode template $(P(S), U(S))$ per ordering equivalence class $S \in \Omega(\mathbb{R}^K)$ (there are only finitely many such classes), and we denote by $m_p(S) := \#P_p(S)$ and $n_p(S) := \#U_p(S)$ their sizes in each homology degree $p$.
Since the barcode templates $(P(S), U(S))$ are total, we have $\sum_{p=0}^{d} (2m_p(S) + n_p(S)) = \#K$. Besides, since the number of infinite intervals in the barcode of a filter function is given by the Betti numbers of the simplicial complex $K$, an easy induction on the homology degree shows that the number of positive (resp. negative) simplices in each homology degree is independent of the choice of filter function and of total barcode template. Therefore, the integers $m_p(S), n_p(S)$ do not depend on the stratum $S$.
For each stratum $S \in \Omega(\mathbb{R}^K)$ and homology degree $p$, we pick arbitrary orderings $(\sigma_{k,S}, \sigma'_{k,S})_{k=1}^{m_p}$ of $P_p(S)$ and $(\tau_{k,S})_{k=1}^{n_p}$ of $U_p(S)$. Any filter function $f \in S$ admits $(P(S), U(S))$ as total barcode template, therefore we get that
$$\mathrm{Dgm}_p(f) = Q_{m_p,n_p}\big( (f(\sigma_{k,S}), f(\sigma'_{k,S}))_{k=1}^{m_p},\ (f(\tau_{k,S}))_{k=1}^{n_p} \big)$$
in every homology degree $p$. We simply set
$$\mathrm{Perm}(f) := \big[ (f(\sigma_{k,S}), f(\sigma'_{k,S}))_{k=1}^{m_p},\ (f(\tau_{k,S}))_{k=1}^{n_p} \big]_{p=0}^{d} \in \prod_{p=0}^{d} \mathbb{R}^{2m_p} \times \mathbb{R}^{n_p},$$
which ensures the commutativity of (11). Since each simplex of $K$ appears exactly once in $(P(S), U(S))$, the vector $\mathrm{Perm}(f)$ is a re-ordering of the coordinates of $f$ (i.e. of its values on the simplices) and therefore $\mathrm{Perm}_{|S}$ is a permutation matrix.
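The construction of Perm in the proof above can be illustrated by a minimal sketch. The function `perm`, the dictionary encoding of the filter function, and the toy complex (one edge with its two endpoints) are hypothetical choices made for illustration only; the template is supplied by hand rather than computed by matrix reduction.

```python
def perm(f, template):
    """Reorder the values of f following a total barcode template.

    `template` lists, for each homology degree p, a pair (P_p, U_p):
    P_p is a list of simplex pairs and U_p a list of unpaired simplices.
    """
    out = []
    for pairs, unpaired in template:
        for sigma, sigma_prime in pairs:
            out.extend([f[sigma], f[sigma_prime]])
        out.extend(f[tau] for tau in unpaired)
    return out


# Hypothetical toy complex: a single edge ab with its two endpoints.
f = {"a": 0.0, "b": 1.0, "ab": 2.0}
template = [
    ([("b", "ab")], ["a"]),  # degree 0: b is paired with ab, a is unpaired
    ([], []),                # degree 1: no interval
]
perm_f = perm(f, template)  # ordered barcode: (1.0, 2.0) bounded, 0.0 infinite
```

Because every simplex appears exactly once in the template, `perm_f` is a re-ordering of the values of `f`, which is the permutation-matrix property used in the proof.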
We now turn to the parametrized barcode valued map $B = \mathrm{Dgm} \circ F$.

Proof. The first part of the statement is a direct consequence of Proposition 4.23. Let $\mathcal{S}_{\mathcal{M}}$ be a stratification satisfying the assumptions of Theorem 4.19. As $F$ is weakly stratified with respect to $\mathcal{S}_{\mathcal{M}}$ and $\Omega(\mathbb{R}^K)$, it sends strata into strata, and therefore by Proposition 4.23 we have $\tilde{B} = \mathrm{Perm}_{M'} \circ F$ for some permutation matrix $\mathrm{Perm}_{M'}$ over each stratum $M' \in \mathcal{S}_{\mathcal{M}}$. Then, since $F$ admits local smooth extensions over each stratum $M'$ of $\mathcal{S}_{\mathcal{M}}$, so do its coordinate functions, and in turn so does $\tilde{B} = \mathrm{Perm}_{M'} \circ F$. These local extensions of $\tilde{B}$ yield directional derivatives for $B$ along incident strata.
Remark 4.25. Recall that the map Perm is linear when restricted to the strata of $\Omega(\mathbb{R}^K)$, which are simply polyhedra in $\mathbb{R}^K$. Therefore, if $\mathcal{M}$ is a semi-algebraic set (resp. subanalytic set or definable set in an o-minimal structure) and $F$ is a semi-algebraic (resp. subanalytic or definable) map, then the global lift $\tilde{B} = \mathrm{Perm} \circ F$ of Corollary 4.24 is itself a semi-algebraic (resp. subanalytic or definable) map. Thus, we recover Proposition 3.

We conclude this section with a side result whose proof (deferred to Appendix A) relies on Proposition 4.23. This result states that Dgm is locally an isometry on top-dimensional strata of $\Omega(\mathbb{R}^K)$. It involves the distance $d_0(f)$ of any filter function $f \in \mathbb{R}^K$ to the union of strata of $\Omega(\mathbb{R}^K)$ of codimension at least 1:
$$d_0(f) = \tfrac{1}{2} \min_{\sigma \neq \sigma'} |f(\sigma) - f(\sigma')|.$$

Proposition 4.26. Let $f, g \in \mathbb{R}^K$ be two filter functions that are located in the closure of a common top-dimensional stratum $S \in \Omega(\mathbb{R}^K)$. Then:
$$\max_{0 \le p \le d} d_\infty(\mathrm{Dgm}_p(f), \mathrm{Dgm}_p(g)) \ge \min\big(\|f - g\|_\infty,\ \max(d_0(f), d_0(g))\big).$$
In particular, for any filter function $f \in \mathbb{R}^K$ located in a top-dimensional stratum, the map Dgm is a local isometry in the closed ball of radius $d_0(f)$ around $f$. Specifically, for any $g, h$ in this ball:
$$\max_{0 \le p \le d} d_\infty(\mathrm{Dgm}_p(g), \mathrm{Dgm}_p(h)) = \|g - h\|_\infty.$$
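As a small illustration of the quantity $d_0(f)$, the following sketch evaluates it on a hypothetical filter function stored as a dictionary from simplices to values. The name `d0` and the toy values are assumptions made for this example.

```python
from itertools import combinations


def d0(f):
    """Half the smallest gap between the values of f on two distinct simplices."""
    values = list(f.values())
    return 0.5 * min(abs(a - b) for a, b in combinations(values, 2))


# Hypothetical filter function on an edge ab and its two endpoints.
f = {"a": 0.0, "b": 1.0, "ab": 2.5}
radius = d0(f)  # radius of the ball on which Dgm is a local isometry around f
```

A value of zero means that two simplices share the same filtration value, i.e. that $f$ does not lie in a top-dimensional stratum.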

Application to common simplicial filtrations
In this section we leverage Theorems 4.7 and 4.9 in the case of a few important classes of parametrizations of filter functions on a simplicial complex K of dimension d. In each case, we derive a characterization of the parameter values where B p is differentiable, and whenever possible we provide an explicit differential of B p using Proposition 4.14. In the following we fix a homology degree 0 ď p ď d.

Lower star filtrations
Parametrizations of lower-star filtrations are involved in most practical scenarios [BGGSSG20, CNBW19, GNDS20, HGR+20, PSO18]; here we provide a common analysis of their differentiability.

Definition 5.1. Given a function $f : K_0 \to \mathbb{R}$ defined on the vertices of $K$, we extend it to each simplex $\sigma$ of $K$ by its highest value on the vertices of $\sigma$. The sub-level sets of this extended function together form the lower-star filtration of $K$ induced by $f$.
One interest of lower-star filtrations is that any parametrization $\mathcal{M} \to \mathbb{R}^{K_0}$ on the vertex set of $K$ induces a valid parametrization $\mathcal{M} \to \mathbb{R}^K$ on $K$ itself. Sufficient conditions for the differentiability of such parametrizations are easy to work out thanks to the following observation:

Proposition 5.2. Let $F_0 : \mathcal{M} \to \mathbb{R}^{K_0}$ be a $C^r$ parametrization of filter functions on the vertices of $K$. Then, the induced parametrization $F : \mathcal{M} \to \mathbb{R}^K$ is $C^r$ at each $\theta \notin \mathrm{Sing}(F_0)$, where $\mathrm{Sing}(F_0)$ is the boundary of the set
$$\{\theta \in \mathcal{M} : \exists\, v \neq v' \in K_0,\ F_0(\theta)(v) = F_0(\theta)(v')\}.$$
Specifically, for every $\theta \notin \mathrm{Sing}(F_0)$, letting $\bar{v} : \sigma \in K \mapsto \mathrm{argmax}_{v \text{ vertex in } \sigma} F_0(\theta)(v) \in K_0$ (breaking ties wherever necessary), there is an open neighborhood $U$ of $\theta$ such that $F(\theta')(\sigma) = F_0(\theta')(\bar{v}(\sigma))$ for every $\theta' \in U$ and $\sigma \in K$, from which it follows that $F$ is $C^r$ at $\theta$.
Proof. The continuity of F comes from the continuity of F 0 and of the max function. If θ P MzSingpF 0 q, then the pre-order on K 0 induced by F 0 p.q is constant in an open neighborhood U of θ. We want to check that F is C r at θ, i.e. that all maps θ 1 Þ Ñ F pθ 1 qpσq are C r at θ, for a fixed simplex σ P K. For σ a vertex of K, this is true by assumption because F p.qpσq " F 0 p.qpσq. For an arbitrary simplex σ, F p.qpσq " max v vertex in σ F 0 p.qpvq. Since the pre-order induced on K 0 by F 0 is constant over U , the maximum above is attained at vertexvpσq, and this fact holds for all θ 1 in U . Thus, F p.qpσq |U " F 0 p.qpvpσqq |U , which allows us to conclude.
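The argument above can be checked on a toy case. The sketch below extends vertex values to a full triangle by taking maxima, records the selected vertex of each simplex, and verifies that the same selection still realizes the maximum at a nearby parameter. The parametrization `vertex_values` and the parameter values are hypothetical and chosen only so that the vertex order is strict and locally constant.

```python
from itertools import combinations


def lower_star(f0, simplices):
    """Extend vertex values to simplices by the max over their vertices.

    Returns (F, vbar), where vbar[sigma] is a vertex of sigma realizing the max
    (ties are broken by taking the first such vertex).
    """
    F, vbar = {}, {}
    for sigma in simplices:
        v = max(sigma, key=lambda u: f0[u])
        vbar[sigma] = v
        F[sigma] = f0[v]
    return F, vbar


def vertex_values(theta):
    """Hypothetical smooth parametrization F_0 on three vertices."""
    return {0: theta, 1: 1.0 + theta ** 2, 2: 0.5 - theta}


vertices = (0, 1, 2)
simplices = [s for k in (1, 2, 3) for s in combinations(vertices, k)]

F, vbar = lower_star(vertex_values(0.10), simplices)
f0_near = vertex_values(0.12)
F_near, _ = lower_star(f0_near, simplices)

# On a neighborhood where the vertex order is constant, F(theta')(sigma)
# coincides with F_0(theta')(vbar(sigma)) for the selection made at theta.
locally_selected = all(F_near[s] == f0_near[vbar[s]] for s in simplices)
```

Each coordinate of $F$ is thus locally a coordinate of $F_0$, which is why the regularity of $F_0$ transfers to $F$ away from the singular set.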
Remark 5.3. Recall that $\mathrm{Sing}(F_0)$ is by definition the boundary of $\{\theta \in \mathcal{M} : \exists\, v \neq v' \in K_0,\ F_0(\theta)(v) = F_0(\theta)(v')\}$, a set whose complement may not be generic (in fact it may even be empty, e.g. when $F_0 = 0$). This shows the interest of working with locally constant pre-orders on vertices, and not just with locally injective parametrizations as in [BGGSSG20, CNBW19, GNDS20, HGR+20, PSO18].
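Defining $\mathrm{Sing}(F_0)$ and $\bar{v}$ as in Proposition 5.2, and combining this result with Proposition 4.14, we deduce the following result on the differentiability of $B_p$, which only relies on the differentiability of $F_0$:

Corollary 5.4. For any $C^r$ parametrization $F_0 : \mathcal{M} \to \mathbb{R}^{K_0}$ on the vertices of $K$, the induced barcode valued map $B_p : \theta \in \mathcal{M} \mapsto \mathrm{Dgm}_p(F(\theta)) \in \mathrm{Bar}$ is $r$-differentiable outside $\mathrm{Sing}(F_0)$. Moreover, at $\theta \in \mathcal{M} \setminus \mathrm{Sing}(F_0)$, for any barcode template $(P_p, U_p)$ of $F(\theta)$ and any choice of ordering $(\sigma_1, \sigma'_1), \dots, (\sigma_m, \sigma'_m), \tau_1, \dots, \tau_n$ of $(P_p, U_p)$, the map $\tilde{B}_p : \mathcal{M} \to \mathbb{R}^{2m} \times \mathbb{R}^{n}$ defined by
$$\tilde{B}_p : \theta' \mapsto \big[ (F_0(\theta')(\bar{v}(\sigma_i)), F_0(\theta')(\bar{v}(\sigma'_i)))_{i=1}^{m},\ (F_0(\theta')(\bar{v}(\tau_j)))_{j=1}^{n} \big]$$
is a local $C^r$ lift of $B_p$ around $\theta$. The corresponding differential for $B_p$ at $\theta$ is
$$d_{\theta,\tilde{B}_p} B_p = \big[ (d_\theta F_0(\cdot)(\bar{v}(\sigma_i)), d_\theta F_0(\cdot)(\bar{v}(\sigma'_i)))_{i=1}^{m},\ (d_\theta F_0(\cdot)(\bar{v}(\tau_j)))_{j=1}^{n} \big].$$

Proof. For $\theta \in \mathcal{M} \setminus \mathrm{Sing}(F_0)$, the pre-order on the vertices $K_0$ induced by $F_0$ is constant in an open neighborhood $U$ of $\theta$. By Proposition 5.2, each $F(\theta')(\sigma)$ rewrites as $F_0(\theta')(\bar{v}(\sigma))$ for $\theta' \in U$, which implies that the pre-order on the simplices of $K$ induced by $F$ is also constant over $U$. The fact that $B_p$ is $r$-differentiable at $\theta$ then follows from Theorem 4.7, since $F$ itself is $C^r$ on an open neighborhood of $\theta$ (again by Proposition 5.2, and by the fact that $\mathrm{Sing}(F_0)$ is closed). The rest of the corollary is an immediate consequence of Proposition 4.14.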
Example 5.5. Consider our running example of parametrization of height filtrations $F_0(\theta) = h_\theta : v \in K_0 \mapsto \langle v, \theta \rangle \in \mathbb{R}$, where $K$ is a fixed geometric simplicial complex in $\mathbb{R}^d$ and $\theta \in S^{d-1}$. In this case, we know from Example 4.11 that $B_p$ is generically $\infty$-differentiable. Corollary 5.4 provides another proof of this fact: since $F_0$ is $C^\infty$, $B_p$ is $\infty$-differentiable outside $\mathrm{Sing}(F_0)$, which has generic complement in $S^{d-1}$. Moreover, the components of the differential of $B_p$ at $\theta \in S^{d-1} \setminus \mathrm{Sing}(F_0)$ are the $d_\theta F_0(\cdot)(v)$, whose corresponding gradients (in the tangent space $T_\theta S^{d-1}$ equipped with the Riemannian structure inherited from $\mathbb{R}^d$) are $v - \langle v, \theta \rangle \theta$.
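The gradient formula $v - \langle v, \theta \rangle \theta$ of Example 5.5 can be sanity-checked numerically. The sketch below, with hypothetical values for $v$ and $\theta$ in $\mathbb{R}^3$, verifies that the vector is tangent to the sphere at $\theta$ and that it reproduces a central finite difference of the height along a curve on the sphere.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(x):
    norm = math.sqrt(dot(x, x))
    return [xi / norm for xi in x]


def riemannian_gradient(v, theta):
    """Gradient on the unit sphere of theta -> <v, theta>, i.e. v - <v, theta> theta."""
    c = dot(v, theta)
    return [vi - c * ti for vi, ti in zip(v, theta)]


theta = normalize([1.0, 2.0, 2.0])  # a unit direction
v = [0.5, -1.0, 2.0]                # a vertex position
grad = riemannian_gradient(v, theta)

# A tangent direction u at theta: project an arbitrary vector onto the tangent space.
w = [1.0, 0.0, 0.0]
u = [wi - dot(w, theta) * ti for wi, ti in zip(w, theta)]

# Central finite difference of the height along the curve t -> normalize(theta + t u).
eps = 1e-6
plus = normalize([t + eps * ui for t, ui in zip(theta, u)])
minus = normalize([t - eps * ui for t, ui in zip(theta, u)])
finite_difference = (dot(v, plus) - dot(v, minus)) / (2 * eps)
```

The two quantities `finite_difference` and `dot(grad, u)` agree up to discretization error, and `dot(grad, theta)` vanishes up to rounding.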

Rips filtrations of point clouds
Given a finite point cloud $P = (p_1, \dots, p_n) \in \mathbb{R}^{nd}$, the Rips filtration of $P$ is a filtration of the total complex $K := 2^{\{1, \dots, n\}} \setminus \{\emptyset\}$ with $n := \#P$ vertices, where the time of appearance of a simplex $\sigma \subseteq \{1, \dots, n\}$ is $\max_{i,j \in \sigma} \|p_i - p_j\|_2$. The authors of [GHO16] optimize the positions of the points of $P$ in $\mathbb{R}^d$ so that the barcode of the Rips filtration reaches some target barcode.
Here we see $\mathbb{R}^{nd}$ as our parameter space $\mathcal{M}$, and we consider the parametrization
$$F : P \in \mathbb{R}^{nd} \mapsto \big( \sigma \mapsto \max_{i,j \in \sigma} \|p_i - p_j\|_2 \big) \in \mathbb{R}^K.$$
The differentiability result of [GHO16] can be expressed as a result on the differentiability of the barcode-valued map $B_p = \mathrm{Dgm}_p \circ F$ using our framework. We require that the points of $P$ lie in general position as defined hereafter:

Definition 5.6 ([GHO16]). $P$ is in general position if the following two conditions hold:
(i) $\forall i \neq j \in \{1, \dots, n\}$, $p_i \neq p_j$;
(ii) $\forall \{i, j\} \neq \{k, l\}$, where $i, j, k, l \in \{1, \dots, n\}$, $\|p_i - p_j\|_2 \neq \|p_k - p_l\|_2$.
We denote byP Ď R nd the subspace of point clouds in general position.
Proposition 5.7. The set of point clouds in general position is generic in $\mathbb{R}^{nd}$.
Proof. The set of point clouds P such that p i ‰ p j for all 1 ď i ‰ j ď n is clearly generic in R nd . Moreover, the maps P " pp 1 , ..., p n q Þ Ñ }p i´pj } 2 2´} p k´pl } 2 2 are smooth everywhere and are submersions on a generic subset of R nd , therefore their 0-sets have generic complements, whose (finite) intersection is also generic.
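The two conditions of Definition 5.6 are straightforward to test on a given point cloud. The following sketch is one possible check; the function name and the optional tolerance `tol` are assumptions made for illustration (with `tol=0.0` the test is the exact one, up to floating-point rounding).

```python
import math
from itertools import combinations


def in_general_position(points, tol=0.0):
    """Check conditions (i) and (ii) of general position for a finite point cloud."""
    dists = [math.dist(p, q) for p, q in combinations(points, 2)]
    if any(dist <= tol for dist in dists):  # condition (i): distinct points
        return False
    # condition (ii): pairwise distances are pairwise distinct
    return all(abs(a - b) > tol for a, b in combinations(dists, 2))


unit_square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
scalene_triangle = [(0.0, 0.0), (1.0, 0.0), (0.3, 2.0)]

square_ok = in_general_position(unit_square)         # equal sides violate (ii)
triangle_ok = in_general_position(scalene_triangle)  # three distinct distances
```

Highly symmetric configurations such as the unit square fail the test, but by Proposition 5.7 an arbitrarily small perturbation restores general position.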
We next observe that the parametrization F is C 8 at point clouds P in general position.
Proposition 5.8. The parametrization F : R nd Ñ R K is C 8 overP. Specifically, given P PP, letting tvpσq,wpσqu " argmax i,jPσ }p i´pj } 2 for every σ P K, there is an open neighborhood U of P such that F pP 1 qpσq " }p 1v pσq´p 1w pσq } 2 for every P 1 " pp 1 1 ,¨¨¨, p 1 n q P U and σ P K, from which follows that F is C 8 at P .
Proof. The continuity of F follows from the continuity of the Euclidean norm and max function. Assuming P is in general position, the distances }p i´pj } 2 , for i ‰ j ranging in t1,¨¨¨, nu, are strictly ordered. By continuity of F , this order remains the same over an open neighborhood U of P in R nd . Therefore, every P 1 " pp 1 1 ,¨¨¨, p 1 n q P U is also in general position, and F pP 1 qpσq " }p 1v pσq´p 1w pσq } 2 for all σ P K. Now, the map P 1 Þ Ñ }p 1v pσq´p 1w pσq } 2 is C 8 at P for each σ because p 1v pσq ‰ p 1w pσq . This implies that F is C 8 at P .
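Defining $\bar{v}, \bar{w}$ as in Proposition 5.8, and combining this result with Proposition 4.14, we deduce the following differential of $B_p$, which only relies on derivatives of the Euclidean distance between points:

Corollary 5.9. The barcode valued map $B_p : P \in \mathbb{R}^{nd} \mapsto \mathrm{Dgm}_p(F(P)) \in \mathrm{Bar}$ is $\infty$-differentiable at every point cloud in general position. Moreover, at such a point cloud $P$, for any barcode template $(P_p, U_p)$ of $F(P)$ and any choice of ordering $(\sigma_1, \sigma'_1), \dots, (\sigma_m, \sigma'_m), \tau_1, \dots, \tau_n$ of $(P_p, U_p)$, the map $\tilde{B}_p$ defined on a point cloud $P' = (p'_1, \dots, p'_n)$ by
$$P' \mapsto \big[ (\|p'_{\bar{v}(\sigma_i)} - p'_{\bar{w}(\sigma_i)}\|_2,\ \|p'_{\bar{v}(\sigma'_i)} - p'_{\bar{w}(\sigma'_i)}\|_2)_{i=1}^{m},\ (\|p'_{\bar{v}(\tau_j)} - p'_{\bar{w}(\tau_j)}\|_2)_{j=1}^{n} \big]$$
is a local $C^\infty$ lift of $B_p$ around $P$. The corresponding differential $d_{P,\tilde{B}_p} B_p : \mathbb{R}^{nd} \to \mathbb{R}^{2m} \times \mathbb{R}^{n}$ is defined on a tangent vector $u \in \mathbb{R}^{nd}$ by
$$d_{P,\tilde{B}_p} B_p(u) = \big[ (\langle P_{\bar{v}(\sigma_i),\bar{w}(\sigma_i)}, u \rangle,\ \langle P_{\bar{v}(\sigma'_i),\bar{w}(\sigma'_i)}, u \rangle)_{i=1}^{m},\ (\langle P_{\bar{v}(\tau_j),\bar{w}(\tau_j)}, u \rangle)_{j=1}^{n} \big],$$
where $P_{i,j}$ denotes the vector with $\frac{p_i - p_j}{\|p_i - p_j\|_2}$ as $i$-th component, $\frac{p_j - p_i}{\|p_i - p_j\|_2}$ as $j$-th component, and 0 as other components.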
This result implies in particular that B p is generically 8-differentiable, since by Proposition 5.7 the set of point clouds in general position is generic in R nd .
Proof. By Proposition 5.8, F is C 8 inP, which is open by Proposition 5.7. Given P in general position, the distances }p i´pj } 2 , for i ‰ j ranging in t1,¨¨¨, nu, are strictly ordered, and this order remains the same over an open neighborhood U of P in R nd by continuity. By Proposition 5.8 again, we have F pP 1 qpσq " }p 1v pσq´p 1w pσq } 2 for every P 1 " pp 1 1 , ..., p 1 n q P U and σ P K. Therefore, the pre-order induced by F on the simplices of K is constant over U . Consequently, B p is 8-differentiable at P by Theorem 4.7. The rest of the statement is an immediate consequence of Proposition 4.14.
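The differential of Corollary 5.9 is built from the vectors $P_{i,j}$, which are the gradients of the pairwise distances. The sketch below computes the Rips value of a simplex together with its maximizing pair, assembles the corresponding vector, and compares the resulting directional derivative with a forward finite difference. The function names, the point cloud, and the direction `u` are hypothetical; no barcode template is computed here.

```python
import math


def rips_value(points, sigma):
    """Return (F(P)(sigma), (v, w)) where (v, w) is a pair of sigma realizing the max."""
    best, pair = 0.0, (sigma[0], sigma[0])
    for a in range(len(sigma)):
        for b in range(a + 1, len(sigma)):
            i, j = sigma[a], sigma[b]
            dist = math.dist(points[i], points[j])
            if dist > best:
                best, pair = dist, (i, j)
    return best, pair


def distance_gradient(points, i, j):
    """The vector P_{i,j}: gradient of P -> ||p_i - p_j||_2, stored as one block per point."""
    dist = math.dist(points[i], points[j])
    grad = [[0.0] * len(p) for p in points]
    grad[i] = [(a - b) / dist for a, b in zip(points[i], points[j])]
    grad[j] = [(b - a) / dist for a, b in zip(points[i], points[j])]
    return grad


def pairing(grad, u):
    """Euclidean inner product of two vectors of R^{nd} stored block-wise."""
    return sum(g * x for gp, up in zip(grad, u) for g, x in zip(gp, up))


points = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]  # pairwise distances 3, 4, 5
sigma = (0, 1, 2)
value, (v, w) = rips_value(points, sigma)

u = [(0.0, 0.0), (1.0, 0.0), (0.0, 0.0)]       # move the second point to the right
directional = pairing(distance_gradient(points, v, w), u)

eps = 1e-6
moved = [tuple(a + eps * b for a, b in zip(p, q)) for p, q in zip(points, u)]
finite_difference = (rips_value(moved, sigma)[0] - value) / eps
```

Since the point cloud is in general position, the maximizing pair is stable under small perturbations, so the finite difference matches `directional` up to discretization error.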
We conclude this section by considering a parametrization that constrains the points $p_1, \dots, p_n$ to evolve along smooth submanifolds $M_1, \dots, M_n$ of $\mathbb{R}^d$, namely the inclusion $\iota : \mathcal{M} := M_1 \times \dots \times M_n \hookrightarrow \mathbb{R}^{nd}$. Note that, in this case, the point clouds $\iota(\theta')$ are not necessarily in general position, but the way they violate conditions (i) and (ii) of Definition 5.6 is constant. Let $U'$ (resp. $U$) denote the set of points of $\mathcal{M}$ around which the way condition (i) (resp. (ii)) is satisfied or violated is locally constant. From the above, $F \circ \iota$ is $C^\infty$ over $U \cap U'$. We now show that $U \cap U'$ is generic in $\mathcal{M}$.
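Calling $U_{ijkl}$ the quadric $\{P \in \mathbb{R}^{nd} \mid \|p_i - p_j\|_2 = \|p_k - p_l\|_2\}$, and $U'_{ij}$ the linear subspace $\{P \in \mathbb{R}^{nd} \mid p_i = p_j\}$, for $i, j, k, l$ ranging in $\{1, \dots, n\}$, we have:
$$U = \bigcap_{\{i,j\} \neq \{k,l\}} \mathcal{M} \setminus \partial \iota^{-1}(U_{ijkl}), \qquad U' = \bigcap_{i \neq j} \mathcal{M} \setminus \partial \iota^{-1}(U'_{ij}).$$
Indeed, for any $\{i, j\} \neq \{k, l\}$, the order between $\|p_i - p_j\|_2$ and $\|p_k - p_l\|_2$ in $\iota(\theta)$ is strict when $\theta$ is in the (open) complement of $\iota^{-1}(U_{ijkl})$, constantly an equality when $\theta$ is inside the (open) interior $\iota^{-1}(U_{ijkl})^{\circ}$, and not locally constant when $\theta$ lies on the boundary $\partial \iota^{-1}(U_{ijkl})$. Hence the formula for $U$. The formula for $U'$ follows from the same argument.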
The sets Bι´1pU ijkl q and Bι´1pU 1 ij q are boundaries of closed sets, and thus their complements in M are generic. As finite intersections of generic sets, U and U 1 are themselves generic. Theorem 4.9 allows us to conclude.

Rips filtrations of clouds of ellipsoids
As pointed out by [BKSW18], in some cases, growing isotropic balls around the points of $P = (p_1, \dots, p_n) \in \mathbb{R}^{nd}$ may result in a loss of geometric information. It is then advised to rather grow ellipsoids with distinct covariance matrices around each point, to account for the local anisotropy of the problem. Formally, the Ellipsoid-Rips filtration of $P$ with respect to the vector of covariance matrices $A = (A_1, \dots, A_n) \in (S_{d,+}(\mathbb{R}))^n$ is a filtration of the total complex $K := 2^{\{1, \dots, n\}} \setminus \{\emptyset\}$ with $n := \#P$ vertices, in which the time of appearance of a simplex $\sigma \subseteq \{1, \dots, n\}$ is given by $\max_{i,j \in \sigma} r_{i,j}(A)$, where the pairwise "ellipsoidal" distances $r_{i,j}(A)$ are expressed in terms of the quadrics $q_i : x \in \mathbb{R}^d \mapsto \langle A_i x, x \rangle$ determined by the positive definite matrices $A_i$. Here we see the space $(S_{d,+}(\mathbb{R}))^n$ as our parameter space $\mathcal{M}$, whose smooth structure is inherited from that of $(\mathbb{R}^{d(d+1)/2})^n$, and we consider the parametrization:
$$F(A)(\sigma) := \max_{i,j \in \sigma} r_{i,j}(A).$$
We are then interested in the differentiability of the barcode valued map $B_p = \mathrm{Dgm}_p \circ F$. Inspired by the case of isotropic Rips filtrations, we require that the covariance matrices in $A$ lie in general position as defined hereafter:

Definition 5.11. The pair $(A, P)$ is in general position if the two following conditions hold:
• all points in $P$ are distinct, i.e. $p_i \neq p_j$ whenever $1 \le i \neq j \le n$;
• all pairwise "ellipsoidal" distances are distinct, i.e. $r_{i,j}(A) \neq r_{k,l}(A)$ whenever $\{i, j\} \neq \{k, l\} \subseteq \{1, \dots, n\}$.

Proposition 5.12. Assume the points of $P$ to be pairwise distinct. Then, the set of vectors of covariance matrices $A$ such that $(A, P)$ is in general position is generic in $(S_{d,+}(\mathbb{R}))^n$.
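Proof. First, we claim that the sets $O_{ijkl} := \{A \in (S_{d,+}(\mathbb{R}))^n \mid r_{i,j}(A) = r_{k,l}(A)\}$, for $\{i, j\} \neq \{k, l\}$, are level-sets of some smooth real valued functions on $(S_{d,+}(\mathbb{R}))^n$ whose gradients are nowhere zero. To prove this fact, we introduce the quantities $C := \frac{\|p_i - p_j\|_2}{\|p_k - p_l\|_2}$ and $(x, y) := \big( \frac{p_i - p_j}{\|p_i - p_j\|_2}, \frac{p_k - p_l}{\|p_k - p_l\|_2} \big)$. Then the condition $r_{i,j}(A) = r_{k,l}(A)$ can be rewritten as a level-set condition on the map $f_{ijkl}$ defined below, with level determined by $C$. Note that $x, y$ are non zero because points in $P$ are distinct. Therefore, the map
$$f_{ijkl} : A \in (S_{d,+}(\mathbb{R}))^n \mapsto \frac{\sqrt{\langle A_i x, x \rangle} + \sqrt{\langle A_j x, x \rangle}}{\sqrt{\langle A_k y, y \rangle} + \sqrt{\langle A_l y, y \rangle}} \in \mathbb{R}$$
is well-defined and smooth on $(S_{d,+}(\mathbb{R}))^n$ as the two inner products in the denominator are always strictly positive. We want to compute $\nabla f_{ijkl} = (\nabla_{A_1} f_{ijkl}, \dots, \nabla_{A_n} f_{ijkl})$, where $\nabla_{A_t} f_{ijkl}$ is the gradient of $f_{ijkl}$ with respect to the $t$-th component of $A$. For $t = i$:
$$\nabla_{A_i} f_{ijkl} = \frac{1}{\sqrt{\langle A_k y, y \rangle} + \sqrt{\langle A_l y, y \rangle}} \cdot \frac{1}{2\sqrt{\langle A_i x, x \rangle}} \cdot \nabla_{A_i} \langle A_i x, x \rangle.$$
The first two factors are strictly positive scalars for any $A \in (S_{d,+}(\mathbb{R}))^n$. The last factor is the gradient of a non-zero linear map, so it is non-zero. As a consequence, the gradient $\nabla_A f_{ijkl}$ is nowhere zero, which proves our claim.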
Then, by the constant rank theorem, each O ijkl is a smooth sub-manifold of S d,`p R d q n of dimension strictly lower than that of S d,`p R d q n . Taking their (finite) union allows us to conclude.
From this point, the same chain of arguments as in the isotropic case allows us to show that the parametrization F is C 8 at vectors of covariance matrices A in general position, and to express the differential of B p at A. Assume the points of P to be pairwise distinct, and denote byÃ Ď S d,`p R d q n the subspace of covariance matrices A such that pA, P q is in general position. Proposition 5.13. The parametrization F : S d,`p R d q n Ñ R K is C 8 overÃ. Specifically, given A PÃ, letting tvpσq,wpσqu " argmax i,jPσ r i,j pAq for every σ P K, there is an open neighborhood U of A such that F pA 1 qpσq " rv pσq,wpσq pA 1 q for every A 1 " pA 1 1 , ..., A 1 n q P U and σ P K, from which follows that F is C 8 at A.
Proof. Let A PÃ. Then, the maps r i,j are C 8 because the points of P are pairwise distinct, and furthermore the quantities r i,j pAq, for i ‰ j ranging in t1,¨¨¨, nu, are strictly ordered. By continuity, this order remains the same over an open neighborhood U of A in S d,`p R d q n . Therefore, for every A 1 P U , for all σ P K, we have F pA 1 qpσq " rv pσq,wpσq pA 1 q. This implies that F is C 8 at A.
Defining $\bar{v}, \bar{w}$ as in Proposition 5.13, and combining this result with Proposition 4.14, we deduce the following formula for the differential of $B_p$, which only relies on derivatives of the maps $r_{i,j}$:

Corollary 5.14. The barcode valued map $B_p : A \in (S_{d,+}(\mathbb{R}))^n \mapsto \mathrm{Dgm}_p(F(A)) \in \mathrm{Bar}$ is $\infty$-differentiable over $\tilde{A}$. Moreover, at $A \in \tilde{A}$, for any barcode template $(P_p, U_p)$ of $F(A)$ and any choice of ordering $(\sigma_1, \sigma'_1), \dots, (\sigma_m, \sigma'_m), \tau_1, \dots, \tau_n$ of $(P_p, U_p)$, the map $\tilde{B}_p$ defined by
$$A' = (A'_1, \dots, A'_n) \mapsto \big[ (r_{\bar{v}(\sigma_i),\bar{w}(\sigma_i)}(A'),\ r_{\bar{v}(\sigma'_i),\bar{w}(\sigma'_i)}(A'))_{i=1}^{m},\ (r_{\bar{v}(\tau_j),\bar{w}(\tau_j)}(A'))_{j=1}^{n} \big]$$
is a local $C^\infty$ lift of $B_p$ around $A$, whose differential provides a closed formula for $d_{A,\tilde{B}_p} B_p$.
This result implies in particular that B p is generically 8-differentiable, since by Proposition 5.12 the set of vectors of covariance matrices in general position is generic in S d,`p R d q n (provided the points of P are pairwise distinct).
Proof. By Proposition 5.13, F is C 8 inÃ, which is open by Proposition 5.12. Given A PÃ, the quantities r i,j pAq, for i ‰ j ranging in t1,¨¨¨, nu, are strictly ordered, and this order remains the same over an open neighborhood U of A in S d,`p R d q n by continuity. By Proposition 5.13 again, we have F pA 1 qpσq " rv pσq,wpσq pA 1 q for every A 1 " pA 1 1 , ..., A 1 n q P U and σ P K. Therefore, the pre-order induced by F on the simplices of K is constant over U . Consequently, B p is 8-differentiable at A by Theorem 4.7. The rest of the statement is an immediate consequence of Proposition 4.14.
Remark 5.15. Corollaries 5.9 and 5.14 can be combined together to generically differentiate the barcode valued map B p with respect to both the point positions and the covariance matrices. The corresponding parameter space is R ndˆS d,`p R d q n .

Arbitrary filtrations of a simplicial complex
In certain scenarios, the optimization takes place in the entire space of filter functions FiltpKq on a fixed simplicial complex K. For instance, in the context of topological simplification of a filter function f 0 , as described by [AGH`09, ELZ02], one looks for a filter function f P R K which is ε-close to f 0 in supremum norm and whose diagram Dgm p pf q equals Dgm p pf 0 qz∆ , where ∆ is the set of intervals of Dgm p pf 0 q that are ε-close to the diagonal. One way to formalize this question is as a soft-constrained optimization problem, whereby the bottleneck distance to the simplified barcode is to be minimized in tandem with the supremum-norm distance to the original function: for some fixed mixing parameter λ. This optimization problem can be tackled using a variational approach, for which it is more convenient to work in the manifold R K containing FiltpKq. However, in order to avoid leaving FiltpKq, we consider the parametrization of R K given by the indicator function of FiltpKq: which is smooth generically. The optimisation becomes then: Implementing a variational approach such as gradient descent requires both terms in (15) to be differentiable. The second term is generically differentiable, as the parametrization F and the norm }¨} 8 are. The first term is the composition which by the chain rule (Proposition 3.14) is differentiable as long as both arrows are. Since F is generically differentiable, so is the first arrow by Theorem 4.9. The second arrow is the bottleneck distance to a fixed diagram and therefore also generically differentiable, as will be argued in Section 7. There, we also view Eq. (15) as an instance of semi-algebraic loss function, which can be minimised via Stochastic Gradient Descent (SGD).

The case of barcode valued maps derived from real functions on a manifold
In this section we consider barcode valued maps that factor through the space $\mathbb{R}^{\mathcal{X}}$ of real functions on a fixed smooth compact $d$-manifold $\mathcal{X}$ without boundary. Since we seek statements about the differentiability of $B$, we restrict the focus to maps that factor through $C^\infty(\mathcal{X}, \mathbb{R})$ equipped with the standard Whitney $C^\infty$ topology:
$$B : \mathcal{M} \xrightarrow{\ F\ } C^\infty(\mathcal{X}, \mathbb{R}) \xrightarrow{\ \mathrm{Dgm}\ } \mathrm{Bar}^{d+1}.$$
Here, Dgm is the map that takes a function $f \in C^\infty(\mathcal{X}, \mathbb{R})$ to the vector of its barcodes $(\mathrm{Dgm}_p(f))_{p=0}^{d}$. It is well-defined on $C^\infty(\mathcal{X}, \mathbb{R})$, as continuous functions on triangulable spaces have well-defined persistence diagrams [CdSGO16]. However, as in the previous sections, we want to work only with barcodes that have finitely many off-diagonal points, therefore we further assume that $F$ takes its values in the subset $\mathrm{Tame}(\mathcal{X})$ of tame $C^\infty$ functions (note that $\mathrm{Tame}(\mathcal{X})$ contains the generic subset of Morse functions on $\mathcal{X}$ [Mil63]). Hence the factorization:
$$B : \mathcal{M} \xrightarrow{\ F\ } \mathrm{Tame}(\mathcal{X}) \xrightarrow{\ \mathrm{Dgm}\ } \mathrm{Bar}^{d+1}.$$
As before, we call $F$ the parametrization associated to $B$, and $\mathcal{M}$ the parameter space, whose elements are generally referred to as $\theta$. We also denote $F(\theta)$ by $f_\theta$ to emphasize the fact that $F$ is valued in a function space. The map Dgm takes $f_\theta$ to the vector of its barcodes $(\mathrm{Dgm}_p(f_\theta))_{p=0}^{d}$, so we can take advantage of the bijective correspondence between the critical points of $f_\theta$ (provided $f_\theta$ is Morse) and the interval endpoints in this vector (Proposition 2.14).
As in the case of a parametrization valued in the space of filter functions on a simplicial complex, we need $F$ to be smooth in some reasonable sense to ensure that the composite $B$ is $\infty$-differentiable. For this, we define a curve $c : \mathbb{R} \to C^\infty(\mathcal{X}, \mathbb{R})$ to be differentiable if the limit $\lim_{h \to 0} \frac{c(t+h) - c(t)}{h}$ exists for all $t \in \mathbb{R}$. The limit can be viewed as a curve, and when iterated limits exist, we say that $c$ is a smooth curve. We then say that the parametrization $F$ is smooth if it sends every smooth curve $\theta(t)$ in $\mathcal{M}$ to a smooth curve $F(\theta(t))$ in $C^\infty(\mathcal{X}, \mathbb{R})$. By Corollary 11.9 in [Mic80], if $F$ is smooth, then its uncurried version
$$\tilde{F} : (\theta, x) \in \mathcal{M} \times \mathcal{X} \mapsto F(\theta)(x) \in \mathbb{R} \qquad (17)$$
is a smooth map in the usual sense, to which we can therefore apply standard results from differential calculus, typically the implicit function theorem. This will be instrumental in the proof of our main result (Theorem 6.1).

Smoothness of the barcode valued map
Theorem 6.1 (Continuous smoothness). Let F : M Ñ C 8 pX , Rq be a parametrization of class C 8 c valued in TamepX q. Let θ P M be a parameter such that f θ is Morse with critical values of multiplicity 1. Then, B is 8-differentiable at θ.
Proof. Since $f_\theta$ is a Morse function on a compact manifold, $\mathrm{Crit}(f_\theta)$ is a finite set whose cardinality we denote by $N_\theta$. We will proceed by proving the following statements in sequence:
(i) There exist an open neighborhood $U$ of $\theta$ and smooth maps $\pi_l : U \to \mathcal{X}$ for $1 \le l \le N_\theta$ that track the critical points, that is: $\forall \theta' \in U$, $\mathrm{Crit}(f_{\theta'}) = \{\pi_l(\theta')\}_{1 \le l \le N_\theta}$.
(ii) Shrinking $U$ if necessary, we further have that for any $\theta' \in U$, $f_{\theta'}$ is Morse with critical values of multiplicity 1.
(v) There exist smooth local coordinate systems for $B_p$ at $\theta$ for every $0 \le p \le d$.
Therefore, by Proposition 3.8, the barcode valued map $B$ is $\infty$-differentiable at $\theta$.
The proofs of assertions (i) and (ii) use differential geometry: we show that we can smoothly track the critical points of $f_{\theta'}$ as $\theta'$ varies in a neighborhood of $\theta$. The proof of assertion (iii) simply exploits the fact that the endpoints in the barcodes of a Morse function are its critical values (Proposition 2.14). Assertion (iv) means that the critical points do not exchange their contributions to the persistence diagrams when the parameter is varying. This will be shown using standard tools in persistence theory. Assertion (v) is obtained by re-indexing the set $\{1, \dots, N_\theta\}$ such that, through this re-indexation, the maps $\theta' \mapsto f_{\theta'}(\pi_l(\theta'))$ provide local coordinate systems as defined in Definition 3.6.

Proof of assertion (i):
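The tangent bundle $T\mathcal{X} = \bigsqcup_{x \in \mathcal{X}} \{x\} \times T_x\mathcal{X}$ is a smooth manifold of dimension $2d$. Let $x_1, \dots, x_{N_\theta}$ be the critical points of $f_\theta$. Locally, in an open neighborhood $\mathcal{V}$ of these critical points, the tangent bundle is parallelizable, i.e. we have a diffeomorphism $T\mathcal{V} \cong \mathcal{V} \times \mathbb{R}^d$ and the projection onto the second component provides a smooth map to $\mathbb{R}^d$. Consider the map
$$\partial F : (\theta', x) \in \mathcal{M} \times \mathcal{V} \mapsto \nabla f_{\theta'}(x) \in \mathbb{R}^d,$$
which is smooth due to the smoothness of $\tilde{F}$, see Eq. (17). Then, at the critical points we have $\partial F(\theta, x_l) = \nabla f_\theta(x_l) = 0$.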
Moreover, because $f_\theta$ is Morse, $\nabla_x \partial F(\theta, x_l) = \nabla^2 f_\theta(x_l)$ is invertible, where $\nabla_x \partial F$ denotes the first derivative of $\partial F$ with respect to its second argument. We can then apply the implicit function theorem to $\partial F$: there exist an open neighborhood $U_l$ of $\theta$, an open neighborhood $V_l$ of $x_l$ (contained in $\mathcal{V}$) and a smooth map $\pi_l : U_l \to V_l$ such that
$$\forall (\theta', x) \in U_l \times V_l, \quad \partial F(\theta', x) = 0 \iff x = \pi_l(\theta'). \qquad (19)$$
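Let $U = \bigcap_{l=1}^{N_\theta} U_l$. After shrinking each $V_l$ so that it equals $\pi_l(U)$, we obtain that (19) holds over $U \times V_l$ for every $1 \le l \le N_\theta$. Now, by definition of $\partial F$ and the $(\Leftarrow)$ direction of (19), we have
$$\forall \theta' \in U, \quad \{\pi_l(\theta')\}_{1 \le l \le N_\theta} \subseteq \mathrm{Crit}(f_{\theta'}).$$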
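We now show the converse inclusion. From the $(\Rightarrow)$ direction in Equation (19), it is sufficient to prove that no critical points of $f_{\theta'}$ can be found in the compact set $W := \mathcal{X} \setminus (\bigcup_{l=1}^{N_\theta} V_l)$ when $\theta'$ ranges over $U$. We equip $\mathcal{X}$ with an arbitrary Riemannian metric $g$, and we consider the smooth map
$$\partial G : (\theta', x) \in \mathcal{M} \times \mathcal{X} \mapsto g_x(\nabla f_{\theta'}(x), \nabla f_{\theta'}(x)) \in \mathbb{R},$$
where $\nabla f_{\theta'}(x) \in T_x\mathcal{X}$. In particular, $\partial G(\theta', x)$ is zero if and only if $x$ is a critical point of $f_{\theta'}$. As a result, $\partial G$ does not vanish on $\{\theta\} \times W$ since $W$ includes no critical point of $f_\theta$. By the compactness of $W$ and the continuity of $\partial G$, there exists an open neighborhood $U'$ of $\theta$ such that $\partial G_{|U' \times W}$ does not vanish either. Assertion (i) follows after shrinking $U$ to $U \cap U'$.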
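The tracking maps $\pi_l$ of assertion (i) can be made concrete on a one-dimensional toy case. The sketch below assumes the hypothetical family $f_\theta(x) = \cos x + \theta \sin x$ on the circle, which is Morse with two critical points for every $\theta$, and follows each critical point by Newton iterations on the equation $f'_\theta(x) = 0$. This is only a numerical stand-in for the implicit function theorem, not part of the proof.

```python
import math


def f(theta, x):
    """Hypothetical family of Morse functions on the circle."""
    return math.cos(x) + theta * math.sin(x)


def df(theta, x):
    """First derivative of f_theta; it vanishes exactly at the critical points."""
    return -math.sin(x) + theta * math.cos(x)


def d2f(theta, x):
    """Second derivative of f_theta; non-zero at non-degenerate critical points."""
    return -math.cos(x) - theta * math.sin(x)


def track(theta_new, x_start, steps=50, tol=1e-14):
    """Follow a critical point to the parameter theta_new by Newton's method."""
    x = x_start
    for _ in range(steps):
        step = df(theta_new, x) / d2f(theta_new, x)
        x -= step
        if abs(step) < tol:
            break
    return x


# At theta = 0 the critical points are x = 0 (maximum) and x = pi (minimum).
theta_new = 0.2
x_max = track(theta_new, 0.0)
x_min = track(theta_new, math.pi)
critical_values = (f(theta_new, x_min), f(theta_new, x_max))
```

For this family the tracked points have the closed form $\arctan\theta$ and $\pi + \arctan\theta$, and the critical values are $\pm\sqrt{1 + \theta^2}$, which vary smoothly with $\theta$ as expected.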
Proof of assertion (ii): Let U be as in assertion (i). Since f θ is Morse, ∇ x BF pθ, x l q " ∇ 2 f θ px l q is invertible for each l P t1, ..., N θ u. BF is of class C 1 as it is of class C 8 , so we get open neighborhoods U 1 l of θ and V 1 l of x l such that ∇ x BF is invertible over U 1 lˆV 1 l . We shrink U to U X p Ş N θ l"1 U 1 l q and each V l to V l X V 1 l , so that the critical points of f θ 1 are non-degenerate for θ 1 P U . Shrinking U further if necessary, a similar argument ensures that the critical values of f θ 1 have multiplicity 1 for all θ 1 P U . This concludes the proof of assertion (ii).
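Proof of assertion (v): For any homology degree $0 \le p \le d$, by assertion (iii), each bounded off-diagonal interval $(b, d)$ in $\mathrm{Dgm}_p(f_\theta) \setminus \Delta$ can be rewritten as $(f_\theta(\pi_{l_{b,p}}(\theta)), f_\theta(\pi_{l_{d,p}}(\theta)))$ for some indices $l_{b,p} \neq l_{d,p}$. Similarly, each interval $(v, +\infty)$ can be rewritten as $(f_\theta(\pi_{l_{v,p}}(\theta)), +\infty)$ for some index $l_{v,p}$. By assertion (iv), for any parameter $\theta' \in U$, $B_p(\theta')$ equals
$$\big\{ (f_{\theta'}(\pi_{l_{b,p}}(\theta')), f_{\theta'}(\pi_{l_{d,p}}(\theta'))) \big\}_{(b,d) \in \mathrm{Dgm}_p(f_\theta) \setminus \Delta} \cup \big\{ (f_{\theta'}(\pi_{l_{v,p}}(\theta')), +\infty) \big\}_{(v,+\infty) \in \mathrm{Dgm}_p(f_\theta)}.$$
This provides a smooth local coordinate system (see Definition 3.7) for $B_p$ at $\theta$, therefore $B_p$ is $\infty$-differentiable at $\theta$ by Proposition 3.8. Since this is true for every $0 \le p \le d$, $B$ itself is $\infty$-differentiable at $\theta$.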
Remark 6.2 (Multiplicity one). The upcoming Figure 5 shows how important the assumption that f θ has critical values of multiplicity 1 is for the conclusion of Theorem 6.1 to hold. Roughly speaking, the assumption implies that the critical points do not exchange their contributions to the persistence diagrams of f θ under perturbations of θ. We proved this fact using the Stability Theorem for persistence diagrams (see the proof of assertion (iv) above), however it is also a consequence of the so-called structural stability theorem for dynamical systems [PS70]. This result implies that the gradient vector field induced by a Morse function f θ with distinct critical values is structurally stable, and as an immediate consequence, that the Morse-Smale complex of f θ does not change as we smoothly perturb f θ . The Morse-Smale complex allows us to recover the persistence module completely and, in turn, the barcode of f θ .

Discussion: generic differentiability
Theorem 6.1 guarantees that $B$ is $\infty$-differentiable at parameters $\theta$ that produce Morse functions with critical values of multiplicity 1. The set of such functions is a generic subspace of $C^\infty(\mathcal{X}, \mathbb{R})$ [GG73]. We can also argue that, under some extra conditions on the parametrization $F$, the set $D(\mathcal{M}, \mathcal{X})$ of parameters $\theta \in \mathcal{M}$ that produce Morse functions $f_\theta$ with critical values of multiplicity 1 is generic in $\mathcal{M}$ ([Nic11]): if $F$ is smooth and generically large, i.e. for generic $x \in \mathcal{X}$ the map $\theta \in \mathcal{M} \mapsto df_\theta(x) \in T_x^*\mathcal{X}$ is a submersion, then $D(\mathcal{M}, \mathcal{X})$ is generic in $\mathcal{M}$.
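There are important examples where this result applies, such as for instance:

Example 6.4 ([Nic11]). Assume $\mathcal{X}$ is embedded in $\mathbb{R}^d$ and translated so as not to contain the origin. Then, each of the following parametrizations $F$ is smooth and generically large:
• $v \in \mathbb{R}^d \mapsto (x \in \mathcal{X} \mapsto \langle v, x \rangle \in \mathbb{R})$
• $p \in \mathbb{R}^d \mapsto (x \in \mathcal{X} \mapsto |x - p|^2 \in \mathbb{R})$
• $A \in S_+(\mathbb{R}^d) \mapsto (x \in \mathcal{X} \mapsto \tfrac{1}{2} \langle Ax, x \rangle \in \mathbb{R})$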

A simple example
Take the ground space X to be the torus S 1ˆS1 embedded in R 3 , the parameter space M to be the 2-sphere S 2 , and the parametrization F to be the family of height filtrations, i.e F : θ P S 2 Þ Ñ px P X Þ Ñ xθ, xy P Rq. For a generic direction θ P S 2 , the induced height function, which we denote by h θ , will be Morse and no two critical points are in the same level set. In this case we can track the critical points smoothly as we vary θ, and the barcodes Dgm p ph θ q also evolve smoothly. An example of this situation is given in Figure 2. The implicit function theorem applied to these critical points allows us to track them smoothly when perturbing the height function (purple arrow). The correspondence to points in the barcode remains unchanged.
Even in this elementary situation, the singular parameters θ P S 2 can exhibit pathological behaviors. There are two specific heights, on opposite sides of the sphere S 2 , that produce Morse-Bott functions. We show one of them in Figure 3. At such a parameter θ, the critical sets are codimension-1 submanifolds of X , and smooth perturbations of θ may result in discontinuous changes in the critical set.
There are other directions $\theta$ at which the assumptions of Theorem 6.1 are not met, yet the interval endpoints in the barcode can still be tracked smoothly. Such a case is shown in Figure 4, where the height function $h_\theta$ is Morse but with a critical value of multiplicity 2. In this specific case, the implicit function theorem still applies to both critical points and provides a smooth local coordinate system for the barcode of $h_\theta$. However, in the general case, such a Morse function with two critical points sitting in the same level-set can induce a change in the correspondence with interval endpoints in the barcode, potentially resulting in non-smooth behavior of the barcode valued map $B$. An example is given in Figure 5.

functions are part of the wide class of linear representations, which we study in Section 7.2. In Section 7.4, we study an important example of non-linear loss, namely the bottleneck distance to a fixed barcode, which we believe can be of interest in the context of inverse problems. The machinery developed in this section is likely to be adaptable to other examples of maps on barcodes, however the purpose of the section is to provide a proof of concept rather than an exhaustive treatment.

The differentiability of persistence images
Recall that Bar is equipped with the bottleneck topology. Let Bar n be the subset of Bar containing the barcodes with n infinite intervals. In particular, Bar 0 is the set of barcodes whose intervals are bounded.
Proposition 7.1. The set of path connected components of Bar is enumerable. More precisely, π 0 pBarq " tBar n u`8 n"0 .
Proof. Since $\mathrm{Bar} = \bigsqcup_{n=0}^{+\infty} \mathrm{Bar}_n$, we only need to prove that each $\mathrm{Bar}_n$ is a maximal path-connected subset of $\mathrm{Bar}$. First note that $\mathrm{Bar}_n$ is path-connected, as we can always move $n$ infinite intervals to $n$ other ones continuously, and similarly move the bounded off-diagonal intervals to the diagonal. We now prove the maximality of $\mathrm{Bar}_n$. Let $A \subseteq \mathrm{Bar} \setminus \mathrm{Bar}_n$ be non-empty. Any element in $A$ has infinite bottleneck distance to any element in $\mathrm{Bar}_n$, since their numbers of infinite intervals are different. Therefore, $A \cup \mathrm{Bar}_n$ cannot be path-connected, and so $\mathrm{Bar}_n$ is maximal.
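We view the persistence image as a map $V : \mathrm{Bar}_0 \to \mathbb{R}^{n^2}$ for some discretization step $n \in \mathbb{N}$:
Definition 7.2. Let $D \in \mathrm{Bar}_0$. We fix a weighting function $\omega : \mathbb{R} \to \mathbb{R}$ that is zero at the origin. For $(b,d) \in \mathbb{R}^2$, consider the Gaussian $g_{b,d} : (x,y) \in \mathbb{R}^2 \mapsto \frac{1}{2\pi\sigma^2} e^{-[(x-b)^2 + (y-(d-b))^2]/2\sigma^2}$ for some fixed variance $\sigma > 0$. The persistence surface associated to $D$ is the map $\rho_D : (x,y) \in \mathbb{R}^2 \mapsto \sum_{(b,d) \in D} \omega(d-b)\, g_{b,d}(x,y)$. Given a square $B \subset \mathbb{R}^2$, we subdivide it into $n^2$ regular squares $B_{k,l}$ for $1 \le k,l \le n$. Then we define the persistence image of $D$ to be the histogram $V_{B,n} : D \in \mathrm{Bar}_0 \mapsto \left( \int_{(x,y) \in B_{k,l}} \rho_D(x,y)\,dx\,dy \right)_{1 \le k,l \le n} \in \mathbb{R}^{n^2}$.
Proposition 7.3. If $\omega$ is $C^r$ over $\mathbb{R}$ for some integer $r \in \mathbb{N}$, then $V_{B,n}$ is $r$-differentiable everywhere in $\mathrm{Bar}_0$.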
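Proof. The maps $(b,d) \in \mathbb{R}^2 \mapsto \int_{(x,y) \in B_{k,l}} g_{b,d}(x,y)\,dx\,dy \in \mathbb{R}$ are $C^\infty$ for any fixed box $B_{k,l}$. For any space of ordered barcodes $\mathbb{R}^{2m} \times \mathbb{R}^0$ and any $\tilde{D} = (b_1, d_1, \dots, b_m, d_m) \in \mathbb{R}^{2m} \times \mathbb{R}^0$, we have $V_{B,n}(Q_{m,0}(\tilde{D})) = \left( \sum_{i=1}^m \omega(d_i - b_i) \int_{(x,y) \in B_{k,l}} g_{b_i,d_i}(x,y)\,dx\,dy \right)_{1 \le k,l \le n}$, which is $C^r$ at every $\tilde{D} \in \mathbb{R}^{2m} \times \mathbb{R}^0$.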
In [AEK`17], the weighting function $\omega$ is chosen to be the ramp function $\omega_t : \mathbb{R} \to \mathbb{R}$ defined as $\omega_t(x) = 0$ if $x \le 0$, $\omega_t(x) = x/t$ if $0 \le x \le t$, and $\omega_t(x) = 1$ if $x \ge t$, for some parameter $t > 0$. Thus, the ramp function is differentiable everywhere except at $0$ and $t$. This implies that the persistence image $V_{B,n}$ is nowhere differentiable, as every neighborhood of a barcode always contains some neighborhood of the diagonal $\Delta$. Thanks to Proposition 7.3, this issue can be resolved by taking any $C^r$ approximation of the ramp function, which makes the persistence image $r$-differentiable over $\mathrm{Bar}_0$.
Figure 5: A 2-torus filtered by two infinitesimal perturbations of the vertical height function, together with their critical points and labels to indicate the dimension of the homology group that they affect. Paired critical points correspond to bounded intervals in the associated barcodes. Here, the vertical height function is Morse and the critical points evolve smoothly. However, neither the pairing between critical points nor their homological dimensions are constant. Therefore, the barcode-valued map is not smooth at the vertical direction.
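The smoothing remedy suggested by Proposition 7.3 can be illustrated with a minimal Python sketch. It assumes that the persistence surface is the sum of the Gaussians $g_{b,d}$ weighted by $\omega(d-b)$, and it replaces the ramp by the cubic "smoothstep" polynomial, which is only one possible $C^1$ stand-in. The names `smooth_ramp`, `gauss_mass` and `persistence_image` are hypothetical helpers, not taken from the paper or from any library.

```python
import math

def smooth_ramp(x, t):
    """C^1 'smoothstep' stand-in for the ramp: 0 for x <= 0, 1 for x >= t.
    Assumption: any C^r approximation would do; this one is C^1."""
    u = min(max(x / t, 0.0), 1.0)
    return u * u * (3.0 - 2.0 * u)

def gauss_mass(lo, hi, mean, sigma):
    """Integral over [lo, hi] of the 1D Gaussian density with mean `mean`
    and standard deviation `sigma`, written with the error function."""
    s = sigma * math.sqrt(2.0)
    return 0.5 * (math.erf((hi - mean) / s) - math.erf((lo - mean) / s))

def persistence_image(bars, x0, y0, side, n, sigma, t):
    """n x n histogram of the (assumed) persistence surface over the square
    [x0, x0 + side] x [y0, y0 + side], for a list of bounded bars (b, d)."""
    h = side / n
    img = [[0.0] * n for _ in range(n)]
    for b, d in bars:
        w = smooth_ramp(d - b, t)  # weight vanishes on the diagonal
        for k in range(n):
            # g_{b,d} is a product of two 1D Gaussians centred at b and d - b,
            # so its integral over a box factors into two 1D masses.
            mx = gauss_mass(x0 + k * h, x0 + (k + 1) * h, b, sigma)
            for l in range(n):
                my = gauss_mass(y0 + l * h, y0 + (l + 1) * h, d - b, sigma)
                img[k][l] += w * mx * my
    return img
```

Because the weight vanishes at zero persistence, a pair lying on the diagonal contributes nothing to the image, which is what makes the histogram well defined on barcodes rather than only on ordered barcodes.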

Linear representations of barcodes
The analysis of persistence images in the previous section can be generalized to the following wide class of vectorizations:
Definition 7.4. Let $\varphi : \mathbb{R}^2 \to \mathbb{R}^k$, $\psi : \mathbb{R} \to \mathbb{R}^k$ and $\omega : \mathbb{R} \to \mathbb{R}$ be continuous maps such that $\omega(0) = 0$. The associated linear representation is the map $V : D \in \mathrm{Bar} \mapsto \sum_{(b,d) \in D \text{ bounded}} \omega(d-b)\,\varphi(b,d) + \sum_{(v,+\infty) \in D} \psi(v) \in \mathbb{R}^k$.
When $k = 1$, a linear representation may be viewed as a loss function on persistence diagrams. The total persistence in Example 3.11, and more generally the $q$-Wasserstein distance to the empty diagram, are such loss functions. In addition, the structure elements of [HKN19, Definition 9] form a wide class of parametrized linear losses and linear representations that can be optimised.
In all these examples, the maps $\varphi$, $\psi$ and $\omega$ are not necessarily smooth by design (see e.g. the ramp function in Eq. (23) for persistence images), but one can always replace them with smooth approximations. We then get $r$-differentiable maps on barcodes, as expressed in the following result.
Proposition 7.5. If the maps $\varphi$, $\psi$ are $C^r$ on generic subsets of $\mathbb{R}^2$ containing the diagonal $\Delta$, and if $\omega$ is $C^r$ on a generic subset of $\mathbb{R}$ containing the origin, then the associated linear representation $V$ is generically $r$-differentiable. Whenever $\varphi$, $\psi$ and $\omega$ are in fact $C^r$ everywhere, $V$ is $r$-differentiable everywhere.
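Proof. The subspace of barcodes whose intervals avoid the set of non-differentiability of $\varphi$, $\psi$ and $\omega$ is clearly generic in $\mathrm{Bar}$. Let $D$ be a barcode therein. For any space of ordered barcodes $\mathbb{R}^{2m} \times \mathbb{R}^n$ and pre-image $\tilde{D} = [(b_i, d_i)_{i=1}^m, (v_j)_{j=1}^n] \in \mathbb{R}^{2m} \times \mathbb{R}^n$ of $D$, we have $V(Q_{m,n}(\tilde{D})) = \sum_{i=1}^m \omega(d_i - b_i)\,\varphi(b_i, d_i) + \sum_{j=1}^n \psi(v_j)$, which is $C^r$ in a neighborhood of $\tilde{D}$.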
Let us consider an everywhere $r$-differentiable linear representation $V$, and a barcode-valued map $B$ on a simplicial complex, which is (generically) differentiable (Theorem 4.9). Using the chain rule (Proposition 3.14), the composition $V \circ B$ is then itself (generically) differentiable, hence amenable to gradient-descent-based optimisation.
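As a small illustration of why such a loss descends to barcodes and can be optimised, the following Python sketch evaluates a scalar ($k = 1$) linear representation on an ordered barcode. It assumes the form "sum of $\omega(d-b)\varphi(b,d)$ over bounded pairs plus sum of $\psi(v)$ over infinite bars", and uses the hypothetical choice $\varphi \equiv 1$, $\psi \equiv 0$, $\omega(p) = p^2$, which is smooth and vanishes at the origin. All function names are illustrative.

```python
def linear_representation(pairs, infinite, phi, psi, omega):
    """Assumed scalar linear representation evaluated on an ordered barcode:
    `pairs` lists the bounded pairs (b, d), `infinite` the left endpoints v."""
    return (sum(omega(d - b) * phi(b, d) for b, d in pairs)
            + sum(psi(v) for v in infinite))

def squared_persistence(pairs, infinite=()):
    """Hypothetical smooth loss: sum of squared lengths of the bounded bars."""
    return linear_representation(
        pairs, infinite,
        phi=lambda b, d: 1.0,
        psi=lambda v: 0.0,
        omega=lambda p: p * p,
    )

def gradient_step(pairs, lr):
    """One gradient-descent step on squared_persistence, using its exact
    gradient: d/db = -2 (d - b) and d/dd = +2 (d - b) for each pair."""
    return [(b + lr * 2.0 * (d - b), d - lr * 2.0 * (d - b)) for b, d in pairs]
```

The value does not change when the pairs are reordered or when trivial pairs $(b, b)$ are appended, since $\omega(0) = 0$; this is the invariance that lets the map factor through the quotient maps $Q_{m,n}$. A call such as `gradient_step([(0.0, 2.0)], 0.1)` shrinks the bar towards the diagonal and lowers the loss.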

Semi-algebraic and subanalytic functions on barcodes
We consider another important class of examples arising from loss functions on barcodes that restrict to semi-algebraic maps on the spaces of ordered barcodes. The subanalytic and definable counterparts are analogously defined and the results of this section are valid in these situations as well. See also [CCG`21] for a full treatment of semi-algebraic loss functions in persistence.
Definition 7.6. We say that a map $V : \mathrm{Bar} \to \mathbb{R}$ is semi-algebraic if all the precompositions $V \circ Q_{m,n} : \mathbb{R}^{2m} \times \mathbb{R}^n \to \mathbb{R}$ are semi-algebraic.
A prototypical example of semi-algebraic loss on barcodes is the distance to a target barcode $D_0$: $D \in \mathrm{Bar} \mapsto d_q(D_0, D)$ or $D \in \mathrm{Bar} \mapsto d_\infty(D_0, D)$. Here, $d_q$ is the $q$-th Wasserstein distance on barcodes for any $q \in \mathbb{R}^*$ as defined in Eq. (7), and $d_\infty$ is the bottleneck distance.
Proposition 7.7. For any target barcode $D_0$ and non-negative number $q \in \mathbb{R}^*$, the map $d_q(D_0, \cdot) : \mathrm{Bar} \to \mathbb{R}$ is semi-algebraic.
Proof. We consider the case where $q = \infty$, as the same line of arguments works for arbitrary Wasserstein metrics, and rewrite $d_q(D_0, \cdot)$ as $d_{D_0}$ for simplicity. Let $m, n \in \mathbb{N}$. We assume that $n$ is the number of infinite intervals in $D_0$, as otherwise the map $d_{D_0} \circ Q_{m,n} : \mathbb{R}^{2m} \times \mathbb{R}^n \to \mathbb{R}$ takes infinite value everywhere. Then, $d_{D_0} \circ Q_{m,n}$ can be expressed as a minimum of finitely many cost functions, $\min_{\gamma_{m,n}} c(\gamma_{m,n})(\cdot)$, each of which is defined in terms of a fixed partial matching $\gamma_{m,n}$ of coordinates in $\mathbb{R}^{2m} \times \mathbb{R}^n$ with interval endpoints of $D_0$. As a point-wise maximum of finitely many absolute values, each cost function $c(\gamma_{m,n})(\cdot)$ is semi-algebraic, and so $d_{D_0} \circ Q_{m,n}$ is semi-algebraic.
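The "minimum over matchings of a maximum of absolute values" structure used in this proof can be made concrete with a brute-force Python sketch. It is restricted to barcodes with bounded intervals only (infinite bars are left out for brevity), enumerates every matching, and is therefore only usable on very small inputs. The function name `bottleneck_bruteforce` is hypothetical and this is not how practical libraries compute the distance.

```python
from itertools import permutations

def bottleneck_bruteforce(D, D0):
    """Bottleneck distance between two small lists of bounded intervals (b, d),
    computed as min over matchings of the max matching cost."""
    # Each side gets enough "diagonal slots" for every point of the other
    # side to be sent to the diagonal instead of being matched to a point.
    left = [("pt", p) for p in D] + [("diag", None)] * len(D0)
    right = [("pt", p) for p in D0] + [("diag", None)] * len(D)

    def cost(a, b):
        (kind_a, pa), (kind_b, pb) = a, b
        if kind_a == "pt" and kind_b == "pt":
            # sup-norm distance between two off-diagonal points
            return max(abs(pa[0] - pb[0]), abs(pa[1] - pb[1]))
        if kind_a == "pt":
            # sup-norm distance from a point to the diagonal
            return abs(pa[1] - pa[0]) / 2.0
        if kind_b == "pt":
            return abs(pb[1] - pb[0]) / 2.0
        return 0.0  # diagonal slot matched with diagonal slot

    best = float("inf")
    for perm in permutations(range(len(right))):
        # cost of one fixed matching: a maximum of absolute values
        c = max((cost(left[i], right[j]) for i, j in enumerate(perm)), default=0.0)
        best = min(best, c)  # minimum over the finitely many matchings
    return best
```

For a fixed permutation the cost is a maximum of finitely many absolute values of affine functions of the coordinates, and the outer loop takes a minimum over finitely many such costs, which mirrors the semi-algebraic description above.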
Semi-algebraic functions $V$ on barcodes are particularly useful in the context of optimisation when pre-composed with a semi-algebraic parametrization of filter functions $F : M \to \mathbb{R}^K$ on a fixed simplicial complex $K$. Indeed, composition preserves semi-algebraicity, and so from Remark 4.25 the loss function given by the composition $L := V \circ \mathrm{Dgm}_p \circ F$ is a semi-algebraic map. Then, [DDKL20, Corollary 5.9] guarantees that the well-known stochastic gradient descent (SGD) algorithm converges almost surely to critical points of $L$.
This guarantee can be applied to various optimisation problems. When choosing the Rips parametrization $F$ of point clouds as in Section 5.2, minimizing the loss $L = d_q(D_0, \cdot) \circ \mathrm{Dgm}_p \circ F$ amounts to solving the problem of point cloud inference originally proposed in [GHO16]; see [GNDS20] for implementations. Besides, from Section 5.4, for $F$ the parametrization of all filter functions on a fixed simplicial complex and an adequate target barcode $D_0$, the minimisation of $L$ yields an approach to function simplification.
However, when $F$ is not semi-algebraic, typically in the continuous setting developed in Section 6, and more generally for an arbitrary barcode-valued map $B : M \to \mathrm{Bar}$, it is unclear how to perform full-fledged continuous gradient descent to minimize the loss $d_q(D_0, \cdot) \circ B$. While implementing a solution to this problem is beyond the scope of this paper, it serves as a motivation for the next section, where we show that the bottleneck distance to $D_0$ is generically $\infty$-differentiable, as then the chain rule of Proposition 3.14 enables the use of gradient descent.

The bottleneck distance to a diagram
For simplicity, we denote the bottleneck distance to a fixed barcode $D_0$ by $d_{D_0} : D \in \mathrm{Bar} \mapsto d_\infty(D, D_0)$. For ease of exposition, we consider the special case where $D_0 = \Delta^\infty$ is the empty diagram (the diagonal $\Delta$ with infinite multiplicity). The analysis of the general case of an arbitrary fixed barcode $D_0$ is technically more involved and is deferred to Appendix B.
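Recall that $d_{\Delta^\infty}(D) = +\infty$ for any diagram $D \in \mathrm{Bar}$ with infinite bars. Consequently, we consider the restriction of $d_{\Delta^\infty}$ to the subset $\mathrm{Bar}_0$ introduced in Section 7.1. This restriction is valued in the real line: $d_{\Delta^\infty} : \mathrm{Bar}_0 \to \mathbb{R}$. Consider the set $\mathrm{Bar}_\Delta$ of barcodes which admit a unique point at maximal distance to the diagonal $\Delta$: $\mathrm{Bar}_\Delta := \{ D \in \mathrm{Bar}_0 \mid \operatorname{argmax}_{(b,d) \in D} \frac{|d-b|}{2} \text{ is a singleton} \}$. For $D \in \mathrm{Bar}_\Delta$, we let $(b_D, d_D) \in D$ be the unique interval in the set $\operatorname{argmax}_{(b,d) \in D} \frac{|d-b|}{2}$.
Proposition 7.8. $\mathrm{Bar}_\Delta$ is generic in $\mathrm{Bar}_0$. Moreover, given $D \in \mathrm{Bar}_\Delta$, for $\varepsilon > 0$ small enough, any $D'$ at bottleneck distance less than $\varepsilon$ from $D$ satisfies $d_{\Delta^\infty}(D') = \frac{|d_{D'} - b_{D'}|}{2}$ and $\|(b_{D'}, d_{D'}) - (b_D, d_D)\|_\infty < \varepsilon$.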
Proof. Given $D \in \mathrm{Bar}_0$, consider the set $\operatorname{argmax}_{(b,d) \in D} \frac{|d-b|}{2}$. If this set is not a singleton, then we can move infinitesimally one of its elements away from the diagonal, so as to get a diagram in $\mathrm{Bar}_\Delta$. Thus, $\mathrm{Bar}_\Delta$ is dense in $\mathrm{Bar}_0$. Let now $D \in \mathrm{Bar}_\Delta$, let $\delta := \max_{(b,d) \in D \setminus \{(b_D, d_D)\}} \frac{|d-b|}{2}$ be the second maximal distance to the diagonal, and let $\alpha := \frac{|d_D - b_D|}{2} - \delta > 0$. Take $\varepsilon \in (0, \frac{\alpha}{4})$. If $D'$ is at bottleneck distance less than $\varepsilon$ from $D$, all the points of $D'$ are within distance less than $\varepsilon$ either from the diagonal or from an off-diagonal point of $D$. As we have picked $\varepsilon < \frac{\alpha}{4}$, there is a unique off-diagonal point $(b', d')$ of $D'$ that is within distance less than $\varepsilon$ from $(b_D, d_D)$, and it must be the unique furthest point from $\Delta$ in $D'$. So indeed $D' \in \mathrm{Bar}_\Delta$ and $(b_{D'}, d_{D'}) = (b', d')$. Therefore, $\mathrm{Bar}_\Delta$ is open, which concludes the proof.
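Not surprisingly, $d_{\Delta^\infty}$ is smooth at every $D \in \mathrm{Bar}_\Delta$, with partial derivatives related to the ones of the map $(b_D, d_D) \mapsto \frac{|d_D - b_D|}{2}$.
Proposition 7.9. For any $D \in \mathrm{Bar}_\Delta$, (i) $d_{\Delta^\infty}$ is $\infty$-differentiable at $D$, and (ii) for any $m \in \mathbb{N}$ and $\tilde{D} \in \mathbb{R}^{2m} \times \mathbb{R}^0$ such that $Q_{m,0}(\tilde{D}) = D$, there are exactly two non-zero components in the gradient $\nabla_{\tilde{D}}(d_{\Delta^\infty} \circ Q_{m,0})$, one with value $\frac{1}{2}$ and the other with value $-\frac{1}{2}$.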
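Part (ii) of Proposition 7.9 can be checked numerically on an ordered barcode. The Python sketch below writes the distance to the empty diagram as the largest half-persistence, returns the gradient with its two non-zero entries, and compares it with central finite differences. The flat list layout `[b1, d1, ..., bm, dm]` and all function names are assumptions made for this illustration.

```python
def dist_to_empty(ordered):
    """Distance to the empty diagram on an ordered barcode [b1, d1, ..., bm, dm]:
    the largest half-persistence among the pairs."""
    pairs = zip(ordered[0::2], ordered[1::2])
    return max((abs(d - b) / 2.0 for b, d in pairs), default=0.0)

def grad_dist_to_empty(ordered):
    """Gradient at an ordered barcode whose most persistent pair is unique."""
    pers = [abs(d - b) for b, d in zip(ordered[0::2], ordered[1::2])]
    top = max(pers, default=0.0)
    winners = [i for i, p in enumerate(pers) if p == top]
    if top == 0.0 or len(winners) != 1:
        raise ValueError("most persistent pair is not unique: outside the smooth locus")
    i = winners[0]
    sign = 1.0 if ordered[2 * i + 1] > ordered[2 * i] else -1.0
    grad = [0.0] * len(ordered)
    grad[2 * i] = -0.5 * sign      # derivative with respect to b_i
    grad[2 * i + 1] = 0.5 * sign   # derivative with respect to d_i
    return grad

def finite_difference(f, x, h=1e-6):
    """Central finite-difference approximation of the gradient of f at x."""
    out = []
    for k in range(len(x)):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        out.append((f(xp) - f(xm)) / (2.0 * h))
    return out
```

Only the coordinates of the unique most persistent pair receive a non-zero derivative; trivial pairs and less persistent pairs do not affect the value locally, which matches the statement that the gradient has exactly two non-zero components.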
We now address the second part of the proposition. Let $f \in \mathbb{R}^K$ be a filter function. By the stability Theorem 2.12, showing that Eq. (13) holds amounts to showing that $\max_{0 \le p \le d} d_\infty(\mathrm{Dgm}_p(f), \mathrm{Dgm}_p(g)) \ge \|f - g\|_\infty$. We denote by $S$ the top-dimensional stratum that contains $f$, and let $g \in \mathbb{R}^K$ be another filter such that $\|f - g\|_\infty \le d_0(f)$. This implies that $g$ is also in the (closure of the) stratum $S$. We can then apply (12), and since by assumption $\|f - g\|_\infty \le d_0(f) \le \max(d_0(f), d_0(g))$, we have the desired result.
Using similar arguments, we finally prove Eq. (14). Let $f \in \mathbb{R}^K$ be a filter function in some top-dimensional stratum $S$, and $g, h \in \mathbb{R}^K$ be such that $\|f - g\|_\infty \le \frac{d_0(f)}{3}$ and $\|f - h\|_\infty \le \frac{d_0(f)}{3}$. By the stability Theorem 2.12, showing that Eq. (14) holds amounts to showing that $\max_{0 \le p \le d} d_\infty(\mathrm{Dgm}_p(g), \mathrm{Dgm}_p(h)) \ge \|g - h\|_\infty$. For every $i \ne j \in \{1, \dots, \#K\}$, Therefore, $\|g - h\|_\infty \le \max(d_0(g), d_0(h))$, and since both $g, h$ are in (the closure of) $S$, we conclude by using Eq. (12).
B The bottleneck distance to a fixed diagram: the general case
Throughout, we denote by $\Delta_\epsilon$ the set of elements in $\mathbb{R}^2$ that are at distance less than $\epsilon > 0$ to the diagonal $\Delta$. We equip, for the purpose of this section only, the spaces of ordered barcodes with the supremum norm $\|\cdot\|_\infty$ rather than the Euclidean norm. Note that (the proof of) Proposition 3.2 ensures that the quotient maps $Q_{m,n}$ are 1-Lipschitz with respect to the metrics in place. We denote by $B(\cdot, *)$ the ball centered at $\cdot$ with radius $*$ with respect to the supremum norm or the bottleneck metric, depending on the context.
In this section we generalize Proposition 7.9, namely we show the generic differentiability of the bottleneck distance $d_{D_0} : \mathrm{Bar} \to \mathbb{R} \cup \{+\infty\}$ to an arbitrary fixed diagram $D_0 \in \mathrm{Bar}$.
Proposition B.1. Let $D_0 \in \mathrm{Bar}$ and $n$ be the number of infinite bars in $D_0$. For generic $D \in \mathrm{Bar}_n$, $d_{D_0}$ is $\infty$-differentiable at $D$. Moreover, for any $m \in \mathbb{N}$ and $\tilde{D} \in \mathbb{R}^{2m} \times \mathbb{R}^n$ such that $Q_{m,n}(\tilde{D}) = D$, exactly one of the following possibilities holds: (i) either the gradient $\nabla_{\tilde{D}}(d_{D_0} \circ Q_{m,n})$ has exactly two non-zero components, one with value $\frac{1}{2}$ and the other with value $-\frac{1}{2}$; or (ii) the gradient $\nabla_{\tilde{D}}(d_{D_0} \circ Q_{m,n})$ has a unique non-zero component with value $1$ or $-1$.
Proposition B.1 states the generic smoothness of $d_{D_0}$. We first observe that all the compositions $d_{D_0} \circ Q_{m,n}$ are smooth on a generic subset of $\mathbb{R}^{2m+n}$.
Lemma B.2. For every $m \in \mathbb{N}$, the map $d_{D_0} \circ Q_{m,n} : \mathbb{R}^{2m+n} \to \mathbb{R}$ is generically smooth, with gradients that are either $0$ or as in (i) or (ii) of Proposition B.1.
Proof. Let $m \in \mathbb{N}$. Define an ordered matching $\tilde\gamma : \mathbb{R}^{2m+n} \to \mathbb{R}^{2m+n}$ to be an affine map whose first $m$ pairs of coordinate functions (resp. last $n$ coordinate functions) are of the form $\tilde{D} := [(b_i, d_i)_{i=1}^m, (v_j)_{j=1}^n] \mapsto (b_i, d_i) - (b_{0,i}, d_{0,i})$, where $(b_{0,i}, d_{0,i})$ is either an off-diagonal point in $D_0$ or $(b_{0,i}, d_{0,i}) = (\frac{b_i + d_i}{2}, \frac{b_i + d_i}{2})$ is the orthogonal projection of $(b_i, d_i)$ onto $\Delta$ (resp. are of the form $\tilde{D} \mapsto v_j - v_{0,j}$ for some infinite interval $(v_{0,j}, +\infty)$ in $D_0$). We further require that the intervals $(b_{0,i}, d_{0,i})$ (resp. $(v_{0,j}, +\infty)$) involved in this way are distinct elements in $D_0$. We denote by $D_0(\tilde\gamma)$ the set of bounded off-diagonal intervals $(b_0, d_0) \in D_0$ that are not in the collection $\{(b_{0,i}, d_{0,i})\}_{i=1}^m$. Since the maximum of smooth functions over $\mathbb{R}^{2m+n}$ is smooth on a generic subset of $\mathbb{R}^{2m+n}$, the map $\tilde{c}(\tilde\gamma) : \tilde{D} \in \mathbb{R}^{2m+n} \mapsto \max\left( \|\tilde\gamma(\tilde{D})\|_\infty, \left\{ \frac{|d_0 - b_0|}{2} \right\}_{(b_0, d_0) \in D_0(\tilde\gamma)} \right) \in \mathbb{R}$ is itself $C^\infty$ on a generic subset of $\mathbb{R}^{2m+n}$, with gradients either equal to $0$ or as in (i) or (ii) of Proposition B.1. Let $\Gamma_m$ be the set of ordered matchings $\tilde\gamma : \mathbb{R}^{2m+n} \to \mathbb{R}^{2m+n}$, which is non-empty and finite. Then the map $\tilde{d}_{D_0,m} : \tilde{D} \in \mathbb{R}^{2m+n} \mapsto \min_{\tilde\gamma \in \Gamma_m} \tilde{c}(\tilde\gamma)(\tilde{D}) \in \mathbb{R}$ is $C^\infty$ on a generic subset of $\mathbb{R}^{2m+n}$, with gradients either equal to $0$ or as in (i) or (ii) of Proposition B.1.
We will be done if we can show that the two maps $d_{D_0} \circ Q_{m,n}$ and $\tilde{d}_{D_0,m}$ are equal over $\mathbb{R}^{2m+n}$. Fix an ordered barcode $\tilde{D} \in \mathbb{R}^{2m+n}$ and let $D := Q_{m,n}(\tilde{D})$. Let $\tilde\gamma : \mathbb{R}^{2m+n} \to \mathbb{R}^{2m+n}$ be an ordered matching. The components of $\tilde\gamma$ determine a matching $\gamma$ between $D$ and $D_0$, sending $(b_i, d_i)$ onto $(b_{0,i}, d_{0,i})$ and $(v_j, +\infty)$ onto $(v_{0,j}, +\infty)$. By the definition of the cost of a matching (Definition 2.10) and Equation (30), we have $c(\gamma) = \tilde{c}(\tilde\gamma)(\tilde{D})$. This yields $\tilde{d}_{D_0,m}(\tilde{D}) \ge d_{D_0}(D) = d_{D_0} \circ Q_{m,n}(\tilde{D})$. Conversely, among the optimal matchings from $D$ to $D_0$, it is always possible to find one that sends off-diagonal points of $D$ (and $D_0$) onto the diagonal only by orthogonal projection. This allows us to lift $\gamma$ at the level of $\tilde{D}$ and to define an ordered matching $\tilde\gamma$ such that $\tilde{c}(\tilde\gamma)(\tilde{D}) = c(\gamma)$. This yields $\tilde{d}_{D_0,m}(\tilde{D}) \le d_{D_0}(D) = d_{D_0} \circ Q_{m,n}(\tilde{D})$ and therefore $d_{D_0} \circ Q_{m,n} = \tilde{d}_{D_0,m}$ on $\mathbb{R}^{2m+n}$.
We cannot directly use Lemma B.2 to prove Proposition B.1. As a matter of fact, by the definition of $\infty$-differentiability (Definition 3.10), Proposition B.1 asks that, for generic $D \in \mathrm{Bar}_n$, all the maps $d_{D_0} \circ Q_{m,n}$, for varying $m \in \mathbb{N}$, be smooth at pre-images of $D$. However, Lemma B.2 only guarantees that the maps $d_{D_0} \circ Q_{m,n}$, taken individually, are smooth over generic subsets of $\mathbb{R}^{2m+n}$, and it is not clear a priori how to glue at the level of barcodes these generic subsets lying in different spaces of ordered barcodes $\mathbb{R}^{2m+n}$. In order to leverage Lemma B.2, we devise intermediate results that infer the smoothness of the maps $d_{D_0} \circ Q_{m',n}$ from the knowledge of the smoothness of a well-chosen map $d_{D_0} \circ Q_{m,n}$. The high-level intuition of each of these intermediate steps is as follows:
1. Infinitesimal perturbations of a given diagram $D$ can be understood as infinitesimal moves of the off-diagonal points of $D$, together with appearances of small intervals from the diagonal. In Lemma B.3, we devise a generic condition on $D$ ensuring that these new small off-diagonal intervals appearing when perturbing $D$ do not play any role in the bottleneck distance to $D_0$.
2. Given a barcode $D$, we take a pre-image $\tilde{D}_m \in Q_{m,n}^{-1}(D)$ of $D$ which is minimal in the sense that its pairs of adjacent components are not trivial, i.e. not of the form $(b, b)$. In other words, $\tilde{D}_m$ is an ordering of the endpoints of off-diagonal intervals appearing in $D$ without extra pairs $(b, b)$ lying on the diagonal. Up to an infinitesimal perturbation of $\tilde{D}_m$, Lemma B.2 ensures that $d_{D_0} \circ Q_{m,n}$ is smooth in an open neighborhood of $\tilde{D}_m$. It is easy to observe that for any other pre-image $\tilde{D}_{m'}$ of $D$, the components of the ordered barcode $\tilde{D}_{m'}$ only differ from those of $\tilde{D}_m$ by the addition of trivial pairs of the form $(b, b)$. According to the previous item, those trivial pairs do not play any role when computing the bottleneck distance to $D_0$. Therefore, since $d_{D_0} \circ Q_{m,n}$ is smooth in a neighborhood of $\tilde{D}_m$, the map $d_{D_0} \circ Q_{m',n}$ is itself smooth in an open neighborhood of $\tilde{D}_{m'}$. We make these intuitions rigorous in Lemma B.4.
3. The previous arguments allow us to construct open balls $B(\tilde{D}_{m'}, \epsilon)$ of the same radius $\epsilon > 0$ around all pre-images $\tilde{D}_{m'} \in \mathbb{R}^{2m'+n}$ of a generic diagram $D \in \mathrm{Bar}$ over which all maps $d_{D_0} \circ Q_{m',n}$ are smooth. To conclude that $d_{D_0}$ itself is $\infty$-differentiable in a neighborhood of $D$, we show in Lemma B.5 that the $\epsilon$-bottleneck ball around $D$ is covered by the union of the images of the balls $B(\tilde{D}_{m'}, \epsilon)$.
We say that an ordered barcode $\tilde{D}_m = [(b_i, d_i)_{i=1}^m, (v_j)_{j=1}^n] \in \mathbb{R}^{2m+n}$ is minimal if $b_i \ne d_i$ for $1 \le i \le m$. This terminology is justified by the fact that the image $D := Q_{m,n}(\tilde{D}_m) \in \mathrm{Bar}_n$ contains exactly $m$ bounded off-diagonal intervals and $n$ unbounded ones, and therefore any other pre-image $\tilde{D}_{m'} \in \mathbb{R}^{2m'+n}$ of $D$ must lie in a space of ordered barcodes of dimension at least $2m+n$ (i.e. $m' \ge m$). We show that under suitable assumptions, the differentiability of all the maps $d_{D_0} \circ Q_{m',n}$ at pre-images $\tilde{D}_{m'}$ of $D$ can be inferred from the differentiability of $d_{D_0} \circ Q_{m,n}$ at the minimal pre-image $\tilde{D}_m$.
Lemma B.4. For every $m \in \mathbb{N}$, the set of minimal ordered barcodes in $\mathbb{R}^{2m+n}$ is open. Moreover, given a minimal $\tilde{D}_m \in \mathbb{R}^{2m+n}$ with $D := Q_{m,n}(\tilde{D}_m) \in \widetilde{\mathrm{Bar}}$, if $d_{D_0} \circ Q_{m,n}$ is $C^\infty$ in an open neighborhood of $\tilde{D}_m$, then there is an $\epsilon > 0$ such that for all other pre-images $\tilde{D}_{m'}$ of $D$, the map $d_{D_0} \circ Q_{m',n}$ is $C^\infty$ in $B(\tilde{D}_{m'}, \epsilon)$, with gradients as in (i) or (ii) of Proposition B.1.
Proof. It is clear that the set of minimal ordered barcodes in $\mathbb{R}^{2m+n}$ is open. We address the second part of the lemma. Let $\tilde{D}_m \in \mathbb{R}^{2m+n}$ be a minimal ordered barcode such that $D := Q_{m,n}(\tilde{D}_m) \in \widetilde{\mathrm{Bar}}$, and assume there is an open neighborhood $U$ of $\tilde{D}_m$ within which $d_{D_0} \circ Q_{m,n}$ is $C^\infty$. By continuity of the quotient map and from the fact that $\widetilde{\mathrm{Bar}}$ is open, we can assume without loss of generality that $Q_{m,n}(U)$ is contained in $\widetilde{\mathrm{Bar}}$.
For any other pre-image $\tilde{D}_{m'} \in \mathbb{R}^{2m'+n}$ of $D$, i.e. an ordered barcode such that $Q_{m',n}(\tilde{D}_{m'}) = D = Q_{m,n}(\tilde{D}_m)$, the first $m'$ adjacent pairs of components of $\tilde{D}_{m'}$ must describe in an arbitrary order the $m$ bounded off-diagonal points of $D$ together with $m' - m$ trivial pairs of the form $(b, b)$. The last $n$ components of $\tilde{D}_{m'}$ must be in correspondence with the left endpoints of infinite intervals in $D$. In other words, the first $2m'$ components of $\tilde{D}_{m'}$ consist of a re-ordering of the first $2m$ components of $\tilde{D}_m$, together with $m' - m$ trivial pairs of the form $(b, b)$. The last $n$ components of $\tilde{D}_{m'}$ consist of a re-ordering of those of $\tilde{D}_m$.
To every pre-image $\tilde{D}_{m'}$ of $D$ as above, we associate the linear projection $L_{m',m} : \mathbb{R}^{2m'+n} \to \mathbb{R}^{2m+n}$ that sends $\tilde{D}_{m'}$ to $\tilde{D}_m$ by re-arranging the $m$ non-trivial pairs of components and the $n$ last components, and forgetting the $m' - m$ trivial pairs. Since $D \in \widetilde{\mathrm{Bar}}$, Lemma B.3 provides an $\epsilon > 0$ such that for any $D' \in B(D, \epsilon)$, the points of $D'$ that are in $\Delta_\epsilon$ may be sent onto the diagonal when computing the bottleneck distance from $D'$ to $D_0$, and furthermore they can be disregarded when computing $d_\infty(D', D_0)$. Therefore, using that the quotient map $Q_{m',n}$ is 1-Lipschitz, we know that for any pre-image $\tilde{D}_{m'}$ of $D$ and $\tilde{D}'_{m'} \in B(\tilde{D}_{m'}, \epsilon)$, the $m' - m$ pairs of components of $\tilde{D}'_{m'}$ with persistence less than $\epsilon$ can be disregarded when computing $d_{D_0} \circ Q_{m',n}(\tilde{D}'_{m'})$. Formally, for every $m' \in \mathbb{N}$, $\forall \tilde{D}_{m'} \in Q_{m',n}^{-1}(D)$, $\forall \tilde{D}'_{m'} \in B(\tilde{D}_{m'}, \epsilon)$, $d_{D_0} \circ Q_{m',n}(\tilde{D}'_{m'}) = d_{D_0} \circ Q_{m,n} \circ L_{m',m}(\tilde{D}'_{m'})$.
Note that the maps $L_{m',m}$ are 1-Lipschitz. Therefore, we can reduce $\epsilon$ in order to ensure that $L_{m',m}(B(\tilde{D}_{m'}, \epsilon)) \subset U$ for every pre-image $\tilde{D}_{m'}$ of $D$. Applying the chain rule to $d_{D_0} \circ Q_{m,n}$ and $L_{m',m}$ (which is an affine map, hence $C^\infty$) in Equation (31), we obtain that all the maps $d_{D_0} \circ Q_{m',n}$ are $C^\infty$ in $B(\tilde{D}_{m'}, \epsilon)$. Also by the chain rule, by definition of $L_{m',m}$, the components of the gradients of the maps $d_{D_0} \circ Q_{m',n}$ are a re-ordering of the components of the gradient of $d_{D_0} \circ Q_{m,n}$. By Lemma B.2, the gradient of the latter is either $0$ or as in (i) or (ii) of Proposition B.1. However, the gradient of $d_{D_0} \circ Q_{m,n}$ being $0$ at some element $\tilde{D}'_m \in U$ would mean that the bottleneck distance $d_\infty(Q_{m,n}(\tilde{D}'_m), D_0)$ equals the distance of some off-diagonal interval $(b_0, d_0)$ to its diagonal projection, which is impossible since $Q_{m,n}(\tilde{D}'_m) \in \widetilde{\mathrm{Bar}}$.
By means of Lemma B.4, we can deduce at once the differentiability of all the maps $d_{D_0} \circ Q_{m',n}$ over balls of the same radius. We need a last result that connects these balls to an actual open neighborhood of $D$ in $\mathrm{Bar}_n$.
Lemma B.5. For any $D \in \mathrm{Bar}_n$, there exists an $\epsilon > 0$ such that for every $m' \in \mathbb{N}$, $Q_{m',n}^{-1}(B(D, \epsilon)) \subseteq \bigcup_{\tilde{D}_{m'} \in \mathbb{R}^{2m'+n},\ Q_{m',n}(\tilde{D}_{m'}) = D} B(\tilde{D}_{m'}, \epsilon)$.
Proof. Let $D \in \mathrm{Bar}_n$, and let $\eta > 0$ be less than all the pairwise distances between geometrically distinct off-diagonal points in $D$, and less than all the distances from off-diagonal points in $D$ to the diagonal. We take $\epsilon > 0$ such that $\epsilon < \frac{\eta}{2}$. Let $D' \in B(D, \epsilon)$. Then, for every off-diagonal point $(b, d)$ of $D$, the number of (off-diagonal) points of $D'$ lying in $B((b, d), \epsilon)$ equals the multiplicity of $(b, d)$ in $D$. Let us say that these points in $D'$ are of type (a). The points of $D'$ that are not in the balls $B((b, d), \epsilon)$, for $(b, d)$ ranging over off-diagonal intervals of $D$, must be $\epsilon$-close to the diagonal, and we say that these points are of type (b). Note that we can accordingly characterize the components of a pre-image $\tilde{D}'_{m'} \in Q_{m',n}^{-1}(D')$: the pairs of components in $\tilde{D}'_{m'}$ must either be trivial (i.e. of the form $(b, b)$), or equal to some off-diagonal point of type (a) or (b). All off-diagonal points of $D'$, of type (a) or (b), counted with multiplicity, must appear as a pair in $\tilde{D}'_{m'}$. Given such a pre-image $\tilde{D}'_{m'} \in Q_{m',n}^{-1}(B(D, \epsilon))$ of $D'$, we construct another ordered barcode $\tilde{D}_{m'} \in \mathbb{R}^{2m'+n}$ by modifying the components of $\tilde{D}'_{m'}$ at cost less than $\epsilon$ (i.e. such that $\|\tilde{D}'_{m'} - \tilde{D}_{m'}\|_\infty < \epsilon$) as follows:
• The last $n$ components of $\tilde{D}'_{m'}$ parametrize the left endpoints of infinite intervals in $D'$. We change them at cost less than $\epsilon$ into the left endpoints of infinite intervals in $D$.
• If a pair $(b', d')$ among the first $m'$ pairs of components of $\tilde{D}'_{m'}$ is of type (a), it is $\epsilon$-close to a unique off-diagonal point $(b, d)$ of $D$. We change it into $(b, d)$.
• If a pair $(b', d')$ among the first $m'$ pairs of components of $\tilde{D}'_{m'}$ is of type (b), it is $\epsilon$-close to the diagonal. We transform it into $(\frac{b' + d'}{2}, \frac{b' + d'}{2})$.
• The remaining pairs in the first $m'$ pairs of components of $\tilde{D}'_{m'}$ must be trivial, and we leave them unchanged.
In this way, we have constructed an ordered barcode $\tilde{D}_{m'}$ such that $\tilde{D}'_{m'} \in B(\tilde{D}_{m'}, \epsilon)$ and also, by construction, $\tilde{D}_{m'}$ is a pre-image of $D$, i.e. $Q_{m',n}(\tilde{D}_{m'}) = D$.
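The construction in this proof can be sketched in Python for the bounded part of an ordered barcode (the infinite intervals are omitted for brevity, which is a simplification of the argument). The functions `snap_to_preimage` and `sup_cost` are hypothetical names, and the sketch assumes that the input pairs come from a barcode within bottleneck distance `eps` of `D`, with `eps` chosen as in the proof.

```python
def snap_to_preimage(perturbed, D_points, eps):
    """Move each pair of a perturbed ordered barcode at sup-norm cost < eps.

    perturbed: list of pairs (b', d') of the perturbed ordered barcode.
    D_points:  off-diagonal points (b, d) of the reference barcode D.
    """
    snapped = []
    for b1, d1 in perturbed:
        close = [p for p in D_points
                 if max(abs(b1 - p[0]), abs(d1 - p[1])) < eps]
        if close:
            # type (a): eps-close to an off-diagonal point of D, snap onto it
            snapped.append(close[0])
        else:
            # type (b) or trivial pair: project orthogonally onto the diagonal
            # (a trivial pair (b, b) is left unchanged by this projection)
            mid = (b1 + d1) / 2.0
            snapped.append((mid, mid))
    return snapped

def sup_cost(pairs_a, pairs_b):
    """Supremum-norm distance between two ordered lists of pairs."""
    return max((max(abs(a[0] - b[0]), abs(a[1] - b[1]))
                for a, b in zip(pairs_a, pairs_b)), default=0.0)
```

For instance, with `D_points = [(0.0, 4.0), (5.0, 10.0)]` and `eps = 0.4`, the perturbed pairs `[(5.1, 9.8), (0.2, 0.3), (1.0, 1.0), (-0.1, 4.2)]` are snapped to the two points of `D` plus two trivial pairs, and the total displacement stays below `eps` in the supremum norm.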
We are now ready to prove Proposition B.1.
Proof of Proposition B.1. Consider the set of barcodes $D \in \mathrm{Bar}_n$ that admit an open neighborhood within which $d_{D_0}$ is $\infty$-differentiable. By definition, this set is open in $\mathrm{Bar}_n$, and we are left to show that it is also dense. Given an arbitrary $D \in \mathrm{Bar}_n$, we will perform a series of infinitesimal perturbations of $D$, so that there exists a (small) open neighborhood $U$ of $D$ over which $d_{D_0}$ is $\infty$-differentiable.
Since $\widetilde{\mathrm{Bar}}$ is generic in $\mathrm{Bar}_n$, up to an infinitesimal perturbation, we can assume that $D$ lies in $\widetilde{\mathrm{Bar}}$. Let $\tilde{D}_m \in \mathbb{R}^{2m+n}$ be a minimal pre-image of $D$. By Lemma B.4, the set of minimal ordered barcodes in $\mathbb{R}^{2m+n}$ is open. Moreover, $d_{D_0} \circ Q_{m,n}$ is smooth on a generic subset of $\mathbb{R}^{2m+n}$ by Lemma B.2. Therefore, up to an infinitesimal perturbation of $\tilde{D}_m$ (which results in an infinitesimal perturbation of $D$ by continuity of $Q_{m,n}$), we can further assume that $d_{D_0} \circ Q_{m,n}$ is smooth on a ball $B(\tilde{D}_m, \epsilon)$ for some $\epsilon > 0$, while $\tilde{D}_m$ remains minimal and $D$ stays in $\widetilde{\mathrm{Bar}}$.
Reducing $\epsilon$ if necessary, by Lemma B.4 all the maps $d_{D_0} \circ Q_{m',n}$ are smooth over $B(\tilde{D}_{m'}, \epsilon)$, with gradients as in (i) or (ii) of Proposition B.1, where $\tilde{D}_{m'}$ ranges over the pre-images of $D$. Reducing $\epsilon$ further if necessary, we conclude that $d_{D_0}$ is $\infty$-differentiable over $B(D, \epsilon)$ by Lemma B.5.