Field Choice Problem in Persistent Homology

This paper tackles the problem of coefficient field choice in persistent homology. When we compute a persistence diagram, we need to select a coefficient field before computation. We should understand the dependence of the diagram on the coefficient field to facilitate computation and interpretation of the diagram. We clarify that the dependence is strongly related to the torsion part of $\mathbb{Z}$ relative homology in the filtration. We show the sufficient and necessary conditions of the independence of coefficient field choice. An efficient algorithm is proposed to verify the independence. A slight modification of the standard persistence algorithm gives the verification algorithm. In a numerical experiment with the algorithm, a persistence diagram rarely changes even when the coefficient field changes if we consider a filtration in $\mathbb{R}^3$. The experiment suggests that, in practical terms, changes in the coefficient field will not change persistence diagrams when the data are in $\mathbb{R}^3$.


Introduction
Topological data analysis (TDA) (Edelsbrunner and Harer, 2010;Carlsson, 2009) is, as the name suggests, the application of topology to data analysis. Persistent homology (Edelsbrunner et al., 2002;Zomorodian and Carlsson, 2005) is one of the most important tools for TDA. In persistent homology, by encoding information on length scales in filtrations, we can capture characteristic geometric features with multiple length scales. By using filtrations, persistent homology is also robust to noise. Homology itself is translation and rotation invariant, and so persistent homology is similarly invariant. These properties are suitable for the analysis of shapes of data, and persistent homology is applied in various practical data analysis contexts in domains such as biology (Chan et al., 2013), image processing (Hu et al., 2019), and materials science (Hiraoka et al., 2016;Saadatfar et al., 2017;Ichinomiya et al., 2017;Kimura et al., 2018).
To describe our problem, we first define persistent homology. Persistent homology is defined on a filtration, that is, an increasing sequence of topological spaces. We consider the following filtration:

$$X : \{X_t\}_{t \in T}, \quad X_s \subset X_t \text{ for } s \le t,$$

where $T$ is $\{0, 1, \dots, N\}$ or $\mathbb{R}_+$. The $q$th persistent homology $H_q(X; k)$ with a coefficient ring $k$ is defined as follows:

$$H_q(X; k) : \left( \{H_q(X_t; k)\}_{t \in T},\ \{\varphi_s^t\}_{s \le t} \right),$$

where $\varphi_s^t : H_q(X_s; k) \to H_q(X_t; k)$ is the homology map induced by the inclusion map $X_s \hookrightarrow X_t$. In standard homology theory, we use $\mathbb{Z}$ as a coefficient ring since the universal coefficient theorem ensures that $\mathbb{Z}$-homology provides the most information about homology. However, in the theory of persistent homology a field is used instead of $\mathbb{Z}$, since the interval decomposition described below is crucial for the analysis of persistent homology and the decomposition is guaranteed only when $k$ is a field. Indeed, the structure theory of persistent homology ensures the existence and uniqueness of the following decomposition of $H_q(X; k)$, called the interval decomposition, if $k$ is a field:

$$H_q(X; k) \simeq \bigoplus_{m=1}^{L} I(b_m, d_m).$$

This theorem depends on the fact that $k[z]$, the polynomial ring with coefficients in a field, is a PID (Zomorodian and Carlsson, 2005), and so this theorem does not hold for $k = \mathbb{Z}$. When this interval decomposition is given, we define the $q$th persistence diagram (PD) $D_q(X; k)$ as the multiset of pairs of endpoints of the intervals. That is, $D_q(X; k) = \{(b_m, d_m)\}_{m=1}^{L}$. Each pair is called a birth-death pair. The values $b_m$ and $d_m$ are called the birth time and death time, respectively, and $d_m - b_m$ is called the lifetime. Since a birth-death pair with a long lifetime corresponds to a "stable" homological structure in the filtration, we can use lifetimes to compare the significance of birth-death pairs.
Normally, we choose $k$ as one of $\mathbb{R}$, $\mathbb{Q}$, and $\mathbb{Z}_p$ for a prime $p$. $\mathbb{Z}_2$ is most often used since it is amenable to a fast algorithm and an intuitive interpretation. Here we face the problem of the choice of $k$. If every $k$ gave the same PD, there would be no problem. However, this does not hold in general, because the dimensions of the homology vector spaces of the same topological space differ between fields when the $\mathbb{Z}$-homology group of the space has non-zero torsion. If a Klein bottle appears in a filtration, the PDs for $k = \mathbb{Z}_2$ and $k = \mathbb{R}$ are clearly different. For the analysis of persistent homology with torsion, Boissonnat and Maria (2014) proposed an efficient algorithm to compute PDs for multiple coefficient fields by utilizing the Chinese remainder theorem. Then, the following questions naturally arise.
- What condition ensures the independence of the choice of the field $k$?
- Is there an efficient algorithm to check the above condition?
- How often does $D_q(X; k)$ change as the field $k$ changes?
- When $D_q(X; k)$ changes depending on the choice of $k$, how does $D_q(X; k)$ change?
In this paper, we offer complete answers for the first and second questions, and partial answers for the third and fourth questions.

Results
To describe the results of the paper, we give some assumptions. We always assume the finiteness of the filtration. A filtration is finite if X = ∪ t X t is a finite simplicial/cell/cubical complex. This condition ensures the existence and uniqueness of the interval decomposition (Zomorodian and Carlsson, 2005). This assumption is reasonable since an infinite filtration cannot be represented on a computer and we cannot use such a filtration in practical applications.
To consider field choice problems, we always restrict the candidates of a field to C, R, Q, and Z p for a prime p.
Question 1 When is D q (X; k) independent of the choice of the field k?
To consider Question 1, it is desirable for the following proposition to hold since H q (X t ; Z) is often free for every t in practical cases.
Proposition 1 (Incorrect!) If H q (X t ; Z) is free for every t ∈ T , the persistent homology H q (X; k) has the same decomposition for any field k.
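However, this proposition has a counterexample (Fig. 1). Let $M$ be a Möbius strip and $\partial M$ be its boundary, and consider the filtration $X_1 = \partial M \subset X_2 = M$. Both $H_1(\partial M; \mathbb{Z})$ and $H_1(M; \mathbb{Z})$ are isomorphic to $\mathbb{Z}$, the homomorphism induced by the inclusion is isomorphic to $n \in \mathbb{Z} \mapsto 2n \in \mathbb{Z}$, and the interval decompositions over $\mathbb{R}$ and $\mathbb{Z}_2$ are different, as follows:

$$H_1(X; \mathbb{R}) \simeq I(1, \infty), \qquad H_1(X; \mathbb{Z}_2) \simeq I(1, 2) \oplus I(2, \infty).$$

In this example, both $H_1(\partial M; \mathbb{Z})$ and $H_1(M; \mathbb{Z})$ are free, but $H_1(M, \partial M; \mathbb{Z}) \simeq \mathbb{Z}_2$ and this is not free. This fact is the key to the different diagrams. Section 3 shows some other examples.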

Fig. 1 Möbius strip and its boundary
We present the following theorem.
Theorem 1 D q (X; k) is independent of the choice of k if H q (X n , X m ; Z) is free for any 0 ≤ m < n ∈ T and H q−1 (X n ; Z) is free for any n ∈ T .
This theorem yields the following corollaries.
Corollary 1 D 0 (X; k) is always independent of the choice of k.
Corollary 2 When $X$ is a filtration of finite cell/simplicial/cubical complexes embedded in $\mathbb{R}^M$, the $(M - 1)$th persistent homology gives the same PD for every field $k$.
Corollary 1 derives from the fact that H −1 (·) = 0 and H 0 (X n , X m ; Z) is free for any n > m. Corollary 2 is proved in Section 4.1. The above two corollaries ensure that if a filtration is embedded in R 2 , all non-trivial persistence diagrams D 0 and D 1 do not depend on the choice of the coefficient field.
We also have the following theorem, which provides a sufficient condition for the freeness of $H_q(X_n, X_m; \mathbb{Z})$.
Theorem 2 For a given q, H q (X n , X m ; Z) are free for any 0 ≤ m < n ∈ T if D q (X; k) is independent of the choice of k and H q−1 (X n ; Z) is free for any n ∈ T .
From the above two theorems, we have the following corollary.
Corollary 3 Let M be a non-negative integer. D q (X; k) is independent of the choice of k for all q = 0, . . . , M if and only if H q (X n , X m ; Z) are free for any 0 ≤ m < n ∈ T , and q = 0, . . . , M .
From the above discussion another question arises.
Question 2 Is there an efficient algorithm for checking the condition of Corollary 3?
Such an algorithm would be useful to provide information as to whether we should be concerned about field choice. Of course, we can compute relative homology groups for all $m < n$ on a computer, but that would be cumbersome and inefficient because the number of possible pairs $(m, n)$ is $(N + 1)N/2$. The computation cost (time complexity) is $O(N^2 G)$, where $G$ is the average cost of computing $H_q(X_n, X_m; \mathbb{Z})$. It is known that the time complexity of computing a PD is $O(G)$.
To describe the algorithm, we assume the following condition.
With this setting, we consider the filtration of complexes X : ∅ = X 0 ⊂ X 1 ⊂ · · · ⊂ X N . Since the filtration is finite, we can transform the persistence decomposition problem into the problem under Condition 1.
The following theorem is proved in Section 6.
Theorem 3 There is an algorithm for judging the condition in Corollary 3 whose time complexity is the same as the algorithm for computing a PD.
The algorithm is shown in Algorithm 2 in Section 6. In Section 7, we apply the algorithm to some examples shown in Section 3 and demonstrate that it performs well. A performance benchmark is also covered in that section. We now pose the following additional question.
Question 3 How often do we face filtrations with non-trivial torsion subgroups?
We can construct such an example from a Möbius strip as shown above, but would we often face such a filtration? To estimate how often torsion appears, we conduct a numerical experiment on random data in $\mathbb{R}^3$. From this experiment, we show that filtrations with non-trivial torsion subgroups are very rare. This suggests that, in practical terms, if the data are in $\mathbb{R}^3$, we do not need to be particularly concerned about the torsion problem. We also conduct another numerical experiment on random filtrations in a high-dimensional simplex. The second experiment shows that filtrations with non-trivial torsion subgroups are common when the space is high dimensional.
The following question is also important.
Question 4 When $D_q(X; k)$ changes depending on the choice of $k$, how does $D_q(X; k)$ change?
In the above example about a Möbius strip, a long interval I(1, ∞) is split into two shorter intervals, I(1, 2) and I(2, ∞), when k changes from R to Z 2 . From the example, we expect that a long interval indecomposable tends to be split into shorter intervals when k changes from R to Z p . The following theorem proved in Section 9 partially answers the question.
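Theorem 4 Assume that $H_q(X_t; \mathbb{Z})$ and $H_{q-1}(X_t; \mathbb{Z})$ are free for all $t$ and $H_q(\cup_t X_t) = 0$. Let $f$ be a $C^2$ convex function on $[0, \infty)$ with $f(0) = 0$. Then the following inequality holds:

$$\sum_{(b, d) \in D_q(X; \mathbb{R})} f(d - b) \ \ge \sum_{(b, d) \in D_q(X; \mathbb{Z}_p)} f(d - b).$$

When $f$ is strictly convex, equality holds if and only if $D_q(X; \mathbb{R}) = D_q(X; \mathbb{Z}_p)$.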
For $f(x) = x^r$ with $r > 1$, the inequality means

$$W_r(D_q(X; \mathbb{R}), \emptyset) \ge W_r(D_q(X; \mathbb{Z}_p), \emptyset),$$

where $W_r$ is the $r$-Wasserstein distance. In some sense, the $r$-Wasserstein distance from the empty diagram indicates the information richness of the diagram. Therefore, $D_q(X; \mathbb{R})$ contains richer information than $D_q(X; \mathbb{Z}_p)$ under the condition of the theorem.
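As a small numerical illustration of how the comparison in Theorem 4 can be read, the sketch below sums $f(d - b)$ over two hypothetical diagrams shaped like the Möbius example, with every class killed at time 3 so that all lifetimes are finite. The two diagrams, the helper name `total_persistence`, and the choice $f(x) = x^r$ are assumptions made for this illustration; they are not output of the experiments in this paper.

```python
def total_persistence(diagram, f):
    """Sum f(d - b) over all birth-death pairs (b, d) of a diagram."""
    return sum(f(d - b) for b, d in diagram)

# Hypothetical diagrams: one long interval over R is split in two over Z_2.
diagram_real = [(1, 3)]
diagram_z2 = [(1, 2), (2, 3)]

for r in (1.5, 2, 3):
    f = lambda x, r=r: x ** r  # convex on [0, inf) with f(0) = 0
    lhs = total_persistence(diagram_real, f)
    rhs = total_persistence(diagram_z2, f)
    print(f"r={r}: sum over R = {lhs}, sum over Z_2 = {rhs}, lhs >= rhs: {lhs >= rhs}")
```

Splitting one interval of length 2 into two intervals of length 1 lowers the sum for every such $f$, because a convex $f$ with $f(0) = 0$ satisfies $f(a + b) \ge f(a) + f(b)$.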
The remainder of the paper is organized as follows. Section 2 reviews the basic concepts of persistent homology. Section 3 shows some examples which exhibit the dependency of PDs on their coefficient fields. Section 4 and Section 5 prove Theorem 1 and Theorem 2. Section 6 presents an algorithm which permits efficient judgement, together with a proof of its correctness. Section 7 introduces an implementation of the algorithm in HomCloud. This section also shows the performance benchmark. Section 8 presents numerical experiments to measure the probability of the appearance of non-trivial torsion in random filtrations. Section 9 contains the proof of Theorem 4 and, finally, conclusions are offered in Section 10.

Persistent homology
In this section, we prepare some fundamental concepts for persistent homology.

Filtrations
A filtration is an increasing sequence of topological spaces. One typical filtration is the union of $r$-balls constructed from a pointcloud in $\mathbb{R}^M$. For a pointcloud, that is, a finite set of points $\{x_i\}$, $X_r$ is defined as

$$X_r = \bigcup_i B_{x_i}(r), \qquad (1)$$

where $B_x(r)$ is the closed ball whose center is $x$ and radius is $r$. The sequence of $X_r$ parameterized by $r$, $\{X_r\}_{r \ge 0}$, is obviously a filtration. This filtration is used to investigate the shape formed by the pointcloud.

For a practical application of persistent homology, we usually use finite simplicial or cubical filtrations since such filtrations are practical to consider on a computer. One well-known filtration is the Čech filtration. The Čech complex $\mathrm{\check{C}ech}(P, r)$ of a pointcloud $P = \{x_i\}$ with radius parameter $r \ge 0$ is defined as follows:

$$\mathrm{\check{C}ech}(P, r) = \Big\{ \sigma \subset P : \sigma \ne \emptyset,\ \bigcap_{x \in \sigma} B_x(r) \ne \emptyset \Big\}. \qquad (2)$$

The filtration $\{\mathrm{\check{C}ech}(P, r)\}_{r \ge 0}$ is called a Čech filtration. From the nerve theorem, $\mathrm{\check{C}ech}(P, r)$ is homotopy equivalent to $\cup_i B_{x_i}(r)$ and we can use the Čech filtration to investigate the union of $r$-balls. There are many simplices in a Čech complex for a large pointcloud, and we usually use an alpha complex (Edelsbrunner and Mücke, 1994; Edelsbrunner, 1995) instead, since the alpha complex is homotopy equivalent to the Čech complex and the number of simplices of the alpha complex is much smaller than that of the Čech complex. The alpha complex has another advantage in that it can be embedded in $\mathbb{R}^M$, whereas such an embedding is in general impossible for the Čech complex.

When a filtration is finite, it is essentially time-discrete even if $T = \mathbb{R}_+$. Therefore we assume $T = \{0, \dots, N\}$ for the proofs of this paper except Theorem 4. In addition, under this assumption, it is straightforward to configure a filtration satisfying Condition 1 by ordering simplices appropriately; hence, we can assume the condition without loss of generality. Since Condition 1 is useful to describe algorithms, we sometimes assume this and consider the filtration

$$X : \emptyset = X_0 \subset X_1 \subset \cdots \subset X_N. \qquad (3)$$

Computation of a persistence diagram
Under Condition 1, Algorithm 1 computes the PD of the filtration (Edelsbrunner et al., 2002; Zomorodian and Carlsson, 2005; Otter et al., 2017). To simplify the algorithm, simplices of all dimensions are handled together, and the output likewise mixes birth-death pairs of all degrees. In this algorithm,

$$L_B(j) = \begin{cases} \max\{ i : B_{ij} \ne 0 \} & \text{if the } j\text{th column of } B \text{ is non-zero}, \\ -\infty & \text{otherwise}, \end{cases}$$

where $B$ is a matrix and $j$ is an integer. Furthermore, in this algorithm, matrix $B$ is reduced from the left column to the right column. After the algorithm terminates, the PD is computed as follows:

$$D(X) = \{ (L_{\hat{B}}(j), j) : L_{\hat{B}}(j) \ne -\infty \} \cup \{ (i, \infty) : L_{\hat{B}}(i) = -\infty \text{ and } L_{\hat{B}}(j) \ne i \text{ for any } j \}, \qquad (4)$$

where $\hat{B}$ is the matrix returned by the algorithm. The $q$th PD is given from $D(X)$ as follows:

$$D_q(X) = \{ (b, d) \in D(X) : \dim \sigma_b = q \}.$$

Algorithm 1 Algorithm to compute persistence diagrams

Justification for the algorithm is provided in Appendix A. Indeed, Theorem 3 shows that the algorithm for judging the condition of Corollary 3 is given by restricting Algorithm 1 to integer coefficients. Therefore, the time complexity of the algorithm of Theorem 3 is the same as that of Algorithm 1.
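The left-to-right column reduction described above can be sketched as follows over a prime field $\mathbb{Z}_p$. This is a minimal illustration, not the authors' implementation: the function name `persistence_pairs`, the dictionary-based column format, and the small cell structure for the Möbius strip (the mapping cylinder of the degree-2 map of a circle, with the boundary circle entering first) are all assumptions made for this example. It also assumes that exactly one cell enters the filtration at each step.

```python
import math

def persistence_pairs(columns, p):
    """Reduce a boundary matrix over Z_p, left to right, and return the pairs.

    columns[j] maps row index -> integer coefficient of the j-th cell's
    boundary; cells are 0-indexed in filtration order, one cell per step.
    Returned times are 1-based, so cell j enters at time j + 1.
    """
    cols = [{r: v % p for r, v in c.items() if v % p} for c in columns]
    owner = {}  # lowest non-zero row -> column that owns it as pivot
    for j, col in enumerate(cols):
        while col and max(col) in owner:
            low = max(col)
            pivot = cols[owner[low]]
            # Division is available because Z_p is a field (Python >= 3.8).
            factor = col[low] * pow(pivot[low], -1, p) % p
            for row, val in pivot.items():
                new = (col.get(row, 0) - factor * val) % p
                if new:
                    col[row] = new
                else:
                    col.pop(row, None)
        if col:
            owner[max(col)] = j
    finite = {(b + 1, d + 1) for b, d in owner.items()}
    essential = {(i + 1, math.inf) for i, c in enumerate(cols)
                 if not c and i not in owner}
    return finite | essential

# Assumed cell structure for the Möbius strip M, boundary circle first.
mobius = [
    {},              # 0: vertex v_b
    {},              # 1: loop b (the boundary circle of M)
    {},              # 2: vertex v_a
    {0: -1, 2: 1},   # 3: edge e from v_b to v_a
    {},              # 4: loop a (the core circle of M)
    {1: 1, 4: -2},   # 5: 2-cell with boundary b - 2a
]

for p in (2, 3):
    print(p, sorted(persistence_pairs(mobius, p)))
```

In this toy filtration the class of the boundary circle (born at time 2) dies at time 6 over $\mathbb{Z}_2$ but survives forever over $\mathbb{Z}_3$, which mirrors the field dependence of the Möbius strip example.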

Persistent Betti number
From the definition of a PD, we have the following relationship between the map H q (X m ; k) → H q (X n ; k) and a PD: This β n m (k) is called a persistent Betti number or a rank invariant. Hence, the following identity holds: When d = ∞, the following equation holds instead: The next lemma follows directly from the foregoing.
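To make the link between a PD and its persistent Betti numbers concrete, the following sketch counts pairs and then recovers each multiplicity by inclusion-exclusion. It assumes a discrete filtration indexed by $0, \dots, N$ and the convention that a pair $(b, d)$ is alive in $X_k$ exactly when $b \le k < d$; the function names and the sample diagram are hypothetical.

```python
import math

def persistent_betti(diagram, m, n):
    """Number of pairs alive from X_m through X_n, i.e. with b <= m <= n < d."""
    return sum(1 for b, d in diagram if b <= m and n < d)

def multiplicity(diagram, b, d, N):
    """Recover the multiplicity of (b, d) from persistent Betti numbers only."""
    beta = lambda m, n: persistent_betti(diagram, m, n)
    if d == math.inf:
        # Essential pairs: alive at the last index N, born exactly at b.
        return beta(b, N) - beta(b - 1, N)
    return beta(b, d - 1) - beta(b - 1, d - 1) - beta(b, d) + beta(b - 1, d)

# Hypothetical diagram on a filtration with N = 6.
diagram = [(1, 3), (2, 5), (2, 5), (4, math.inf)]
print(multiplicity(diagram, 2, 5, N=6))
print(multiplicity(diagram, 4, math.inf, N=6))
```

Because every multiplicity is determined by the numbers $\beta_m^n(k)$, two fields give the same diagram exactly when they give the same persistent Betti numbers.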

Universal Coefficient Theorem
The universal coefficient theorem is fundamental for homology theory and plays an important role in this paper. We review the theorem here to foreground what follows.
The universal coefficient theorem for homology is as follows (Hatcher, 2002).
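Theorem A Let $X$ be a topological space, $k$ a field, and $q \ge 0$. The following sequence is a natural short exact sequence:

$$0 \to H_q(X; \mathbb{Z}) \otimes k \to H_q(X; k) \to \operatorname{Tor}(H_{q-1}(X; \mathbb{Z}), k) \to 0.$$

Furthermore, this sequence splits, though not naturally.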
We use the above theorem in the following form.
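Theorem B Let $X$ and $Y$ be topological spaces, $f : X \to Y$ a continuous map, $k$ a field, and $q \ge 0$. If $H_{q-1}(X; \mathbb{Z})$ and $H_{q-1}(Y; \mathbb{Z})$ are free, the following diagram commutes and its vertical maps are isomorphisms:

$$\begin{array}{ccc} H_q(X; \mathbb{Z}) \otimes k & \xrightarrow{\ f_* \otimes \mathrm{id}_k\ } & H_q(Y; \mathbb{Z}) \otimes k \\ \downarrow \simeq & & \downarrow \simeq \\ H_q(X; k) & \xrightarrow{\ f_*\ } & H_q(Y; k) \end{array}$$

This theorem states that the induced map $f_* : H_q(X; k) \to H_q(Y; k)$ can be identified with $f_* \otimes \mathrm{id}_k$ when $H_{q-1}(X; \mathbb{Z})$ and $H_{q-1}(Y; \mathbb{Z})$ are free. We use the theorem for an inclusion map between simplicial/cell/cubical complexes.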

Examples of diagrammatic changes induced by coefficient field changes
In this section, we will give some examples of persistent homology, whose interval decomposition depends on the choice of coefficient field.
Example 1 Let S 1 be a circle. We consider a filtration X : where S 1 ∨ S 1 is a bouquet of 2-circles, f = 1 1 and g = 1 1 . By taking the 1st homology of this filtration, we obtain the 1st persistent homology . Thus, the interval decomposition of the 1st persistent homology of X depends on the choice of coefficient field.
Note that if we consider a bouquet of p-circles for a prime p, then we obtain the 1st persistent homology, which has different decompositions over Z p and R.
By using Example 1, we can consider the 1st persistent homology, whose interval decomposition depends on the choice of characteristic p > 0.
be the 1st persistent homology. Then M has the following interval decomposition:  The following proposition is required to prove the theorem.
Proposition 2 If H q (X n , X m ; Z) is free, coker (φ n m : H q (X m ; Z) → H q (X n ; Z)) is also free.
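Proof We have the following long exact sequence for the pair $(X_n, X_m)$:

$$\cdots \to H_q(X_m; \mathbb{Z}) \xrightarrow{\varphi_m^n} H_q(X_n; \mathbb{Z}) \xrightarrow{\psi_m^n} H_q(X_n, X_m; \mathbb{Z}) \xrightarrow{\partial} H_{q-1}(X_m; \mathbb{Z}) \to \cdots,$$

where $\psi_m^n$ is induced by the canonical projection. Therefore, we have the following relationship between $\operatorname{coker}(\varphi_m^n)$ and $H_q(X_n, X_m; \mathbb{Z})$:

$$\operatorname{coker}(\varphi_m^n) = H_q(X_n; \mathbb{Z}) / \operatorname{im} \varphi_m^n = H_q(X_n; \mathbb{Z}) / \ker \psi_m^n \simeq \operatorname{im} \psi_m^n \subset H_q(X_n, X_m; \mathbb{Z}).$$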
To complete the proof, we show that im ψ n m is free, and this derives from the following well-known theorem.
Theorem C Any sub-module of a free Z-module is also free.
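From the assumption of the theorem, $H_q(X_m; \mathbb{Z}) = H_q(X_m, X_0; \mathbb{Z})$ is free for all $m$. Hence, $\varphi_m^n : H_q(X_m; \mathbb{Z}) \to H_q(X_n; \mathbb{Z})$ is a homomorphism between two finitely generated free $\mathbb{Z}$-modules and the map has a Smith normal form (SNF). That is, by taking appropriate bases, $\varphi_m^n$ can be represented by the following $\mathbb{Z}$ matrix:

$$\begin{pmatrix} \alpha_1 & & & \\ & \ddots & & \\ & & \alpha_K & \\ & & & O \end{pmatrix}, \qquad (13)$$

where $0 < \alpha_i \in \mathbb{Z}$ and $\alpha_i \mid \alpha_{i+1}$ for any $i$. Then from Theorem B, the following relationship holds:

$$\beta_m^n(k) = \operatorname{rank}(\varphi_m^n \otimes \mathrm{id}_k) = \#\{ i : \alpha_i \ne 0 \text{ in } k \}. \qquad (14)$$

From (13) and (14), we know that $\beta_m^n(k)$ is independent of the choice of $k$ if and only if $\alpha_1 = \cdots = \alpha_K = 1$. From the SNF, we also have the following:

$$\operatorname{coker}(\varphi_m^n) \simeq \mathbb{Z}^{r} \oplus \bigoplus_{i=1}^{K} \mathbb{Z}/\alpha_i \mathbb{Z}, \qquad r = \operatorname{rank} H_q(X_n; \mathbb{Z}) - K.$$

This means that $\alpha_1 = \cdots = \alpha_K = 1$ if and only if $\operatorname{coker}(\varphi_m^n)$ is free, and this condition follows from Proposition 2 and the assumption of the theorem.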
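The role of the diagonal entries $\alpha_i$ can be checked numerically: the rank of an integer matrix over $\mathbb{Q}$ counts all non-zero $\alpha_i$, while the rank over $\mathbb{Z}_p$ counts only those not divisible by $p$. The sketch below uses plain Gaussian elimination; the function name `rank` and the sample matrices are assumptions for illustration, and the $1 \times 1$ matrix $(2)$ stands in for the map $n \mapsto 2n$ of the Möbius strip example.

```python
from fractions import Fraction

def rank(matrix, p=None):
    """Rank of an integer matrix over Q (p is None) or over Z_p (p prime)."""
    if p is None:
        rows = [[Fraction(x) for x in row] for row in matrix]
        divide = lambda a, b: a / b
        normalize = lambda a: a
    else:
        rows = [[x % p for x in row] for row in matrix]
        divide = lambda a, b: a * pow(b, -1, p) % p  # modular inverse
        normalize = lambda a: a % p
    found = 0
    for c in range(len(rows[0]) if rows else 0):
        pivot = next((r for r in range(found, len(rows)) if rows[r][c] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(found + 1, len(rows)):
            factor = divide(rows[r][c], rows[found][c])
            rows[r] = [normalize(a - factor * b) for a, b in zip(rows[r], rows[found])]
        found += 1
    return found

mobius_map = [[2]]                                   # n -> 2n
snf_example = [[1, 0, 0, 0], [0, 2, 0, 0], [0, 0, 6, 0]]  # alpha = (1, 2, 6)

print(rank(mobius_map), rank(mobius_map, 2), rank(mobius_map, 3))
print(rank(snf_example), rank(snf_example, 2), rank(snf_example, 3))
```

The ranks agree for every field only when no prime divides any $\alpha_i$, that is, when all $\alpha_i$ equal 1.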

Proof of Corollary 2
Standard homology theory (Hatcher, 2002, Corollary 3.46, p. 256) shows that $H_{M-1}(X_n; \mathbb{Z})$ and $H_{M-2}(X_n; \mathbb{Z})$ are free under the condition of this corollary; that result is shown by using Alexander duality. $H_{M-1}(X_n, X_m; \mathbb{Z})$ is also free since, by Alexander duality, this relative homology group is isomorphic to a 0th relative cohomology group. Therefore, Theorem 1 applies to the filtration.

Proof of Theorem 2
The proof is similar to that of Theorem 1, but slightly more complex. We prepare the following proposition.
Proposition 3 H q (X n , X m ; Z) is free if coker (φ n m : H q (X m ; Z) → H q (X n ; Z)) and H q−1 (X m ; Z) are free.
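Proof From the long exact sequence for the pair $(X_n, X_m)$, we have the following facts:

$$\ker \partial = \operatorname{im} \psi_m^n \simeq \operatorname{coker}(\varphi_m^n), \qquad H_q(X_n, X_m; \mathbb{Z}) / \ker \partial \simeq \operatorname{im} \partial \subset H_{q-1}(X_m; \mathbb{Z}),$$

where $\partial : H_q(X_n, X_m; \mathbb{Z}) \to H_{q-1}(X_m; \mathbb{Z})$ is the connecting homomorphism. $\operatorname{im} \partial$ is free since $H_{q-1}(X_m; \mathbb{Z})$ is free. We complete the proof by applying the following theorem from standard algebra to the sub-module $\ker \partial$ of $H_q(X_n, X_m; \mathbb{Z})$.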
Theorem D Let M be a module over Z and N be a sub-module of M . M is finitely generated and free if N and M/N are finitely generated and free.

Proof of Theorem 2
We assume that $D_q(X; k)$ is independent of the choice of $k$. Then from Lemma 1, $\beta_m^n(k)$ is independent of $k$ for any $m$ and $n$. In particular, for any $n$, $\beta_n^n(k) = \dim H_q(X_n; k)$ is independent of $k$, and therefore $H_q(X_n; \mathbb{Z})$ is free by the universal coefficient theorem, since $H_{q-1}(X_n; \mathbb{Z})$ is free. Then $\varphi_m^n$ has an SNF and $\operatorname{coker} \varphi_m^n$ is free by the argument in the proof of Theorem 1. From the above fact and Proposition 3, we conclude that $H_q(X_n, X_m; \mathbb{Z})$ is free.

Proof of Corollary 3
From Theorem 1, it is straightforward to show that D q (X; k) is independent of the choice of k for all q = 0, . . . , M if H q (X n , X m ; Z) are free for any 0 ≤ m < n ∈ T , and q = 0, . . . , M .
We can show the converse by induction on q. For q = 0, it is trivial that H 0 (X n , X m ; Z) is free and the induction process proceeds by using Theorem 2.

Algorithm to determine the dependency of D q (X; k) on k
In this section, we explore an algorithm to judge the existence of non-zero torsion and prove Theorem 3. See Algorithm 2. Now we prove the following facts.
- If the algorithm returns "independent", the given filtration satisfies the condition of Corollary 3. Therefore, D q (X; k) is independent of the choice of k.
- If the algorithm returns "dependent", the given filtration does not satisfy the condition of Corollary 3. Therefore, D q (X; k) depends on the choice of k.
Algorithm 2 Algorithm to determine the dependency of $D_q(X; k)$ on $k$
let B be the matrix representation of the boundary operator
for j = 1, . . . , N do

We remark that $B_{L_B(i), i}$ in this algorithm is always $\pm 1$ at (A), so the division at (A) is always possible over $\mathbb{Z}$. This is because the condition is checked at (B).
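One way to read the check described above is as the usual column reduction carried out over the integers, stopping as soon as a reduced column has a lowest non-zero entry other than $\pm 1$. The sketch below follows that reading; it is an illustration under stated assumptions and not the HomCloud implementation. The function name, the column format, and the two test complexes (a filled triangle and a cell structure for the Möbius strip with its boundary circle entering first) are hypothetical. In the code, the column subtraction plays the role of step (A) and the pivot test plays the role of condition (B).

```python
def check_field_independence(columns):
    """Integer column reduction that reports a pivot different from +-1.

    columns[j] maps row index -> integer coefficient of the j-th cell's
    boundary, with cells listed in filtration order (one cell per step).
    Returns ("independent", None) or ("dependent", p) with p = |pivot|.
    """
    cols = [dict(c) for c in columns]
    owner = {}  # lowest non-zero row -> column that owns it as pivot
    for j, col in enumerate(cols):
        while col and max(col) in owner:
            low = max(col)
            pivot = cols[owner[low]]
            # (A) stored pivots are +-1, so dividing equals multiplying.
            factor = col[low] * pivot[low]
            for row, val in pivot.items():
                new = col.get(row, 0) - factor * val
                if new:
                    col[row] = new
                else:
                    col.pop(row, None)
        if col:
            low = max(col)
            # (B) a pivot other than +-1 signals torsion.
            if abs(col[low]) != 1:
                return "dependent", abs(col[low])
            owner[low] = j
    return "independent", None

# Filled triangle: vertices 0-2, edges 3-5, triangle 6.
triangle = [{}, {}, {}, {0: -1, 1: 1}, {1: -1, 2: 1}, {0: -1, 2: 1},
            {3: 1, 4: 1, 5: -1}]
# Assumed Möbius strip cells: v_b, loop b, v_a, edge e, loop a, 2-cell (b - 2a).
mobius = [{}, {}, {}, {0: -1, 2: 1}, {}, {1: 1, 4: -2}]

print(check_field_independence(triangle))
print(check_field_independence(mobius))
```

For the Möbius strip input the reported number is 2, which matches the fact that the diagram changes only for the field of characteristic 2.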
For the proof, we use ideas presented in Appendix A. We also use Notation 1 in Appendix A and check whether $H(X_n, X_m; \mathbb{Z}) = \bigoplus_{q=0}^{M} H_q(X_n, X_m; \mathbb{Z})$ has a non-zero torsion subgroup for every $m < n$. We assume that $\dim(X) \le M + 1$ by removing all simplices whose dimensions exceed $M + 1$.
First we prove T (H q (X n , X m ; Z)) = 0 for every m < n when the algorithm returns "independent". LetB be the matrix B just before returning "independent". Since division is not used in the proof of Fact 2 in Section A.1, the discussion can also apply to Z-homology and we can find a basis of C(X N ; Z), {σ 1 , . . . ,σ N }, satisfying Condition 3 (i), (ii), (iii), and (iv). We can explicitly write the bases of C(X m ; Z) and C(X n , X m ; Z) as follows: {σ 1 , . . . ,σ m } is a basis of C(X m ; Z), {σ m+1 + C(X m ; Z), . . . ,σ n + C(X m ; Z)} is a basis of C(X n , X m ; Z).
Let ∂ n,m : C(X n , X m ; Z) → C(X n , X m ; Z) be the boundary operator on relative chain complexes. From Condition 3 (ii), (iii), and (iv), ker ∂ n,m and im ∂ n,m are both free Z-modules and the bases are and Therefore, the basis of H(X n , X m ; Z) = ker ∂ n,m /im ∂ n,m can be written as follows: where [z + C(X m ; Z)] is a homology class of z + C(X m ; Z) in H(X n , X m ; Z). Therefore, H(X n , X m ; Z) is a free Z-module and we complete the proof for the "independent" case. Next we show that there is a pair (m, n) such that T (H(X n , X m ; Z)) ≠ 0 if the algorithm returns "dependent". In that case, condition (B) in the algorithm is true for one j, so let n be that j and let B̂ be the matrix B at that time. We consider the subfiltration X n = ∅ = X 0 ⊂ · · · ⊂ X n−1 . Algorithm 2 applies on the subfiltration, so we can find a basis of C(X n−1 ) satisfying Condition 3(i)-(iv). That is, there is a basis of C(X n−1 ), {σ 1 , . . . ,σ n−1 }, and a decomposition of {1, . . . , n − 1}, D n D n E n , such that the following conditions hold.
(a) {σ 1 , . . . ,σ k } is a basis of C(X k ; Z) for any 1 ≤ k < n. (b) ∂σ j = 0 for j ∈ D n . (c) LB is a bijection from D n to D n and ∂σ j =σ LB (j) for any j ∈ D n . (d) ∂σ i = 0 for i ∈ D n E n . Furthermore, since the loop (INNERLOOP) also terminates at j = n, LB(n) is not −∞ and there existsσ n ∈ C(X n ; Z) such that the following conditions hold: C(X n ) = C(X n−1 ) ⊕ σ n , By the same discussion in Section A.1, there exist integers holds. We can also show ∂σ LB (n) = 0 in the same manner as in the proof of the claim in Appendix A.1. Therefore, LB(n) ∈ D n E n , but since LB(n) = LB(i) for any 1 ≤ i < n, LB(n) ∈ D n and we have LB(n) ∈ E n . Now let m := LB(n) − 1. From (a)-(d) and (22), the bases of ker ∂ n,m and im ∂ n,m can be explicitly written as follows: and where p = |B m+1,n |. Using m + 1 ∈ E n , finally we have and The proof for the "dependent" case is completed.
Here, when the algorithm returns "dependent", the number $p$ is displayed. This helps in understanding how $D(X; \mathbb{Z}_{p'})$ depends on the primes $p'$ that divide $p$.

Algorithm implementation
The judgement algorithm is implemented in HomCloud 2 . The twist algorithm introduced by Chen and Kerber (2011) is used for faster computations. The program correctly judges the existence of the torsion for pointclouds shown in Fig. 4 (a) and (e).

Performance benchmark
In this section, we explore the performance of Algorithm 2. We compare the program implemented in HomCloud with Phat (Bauer et al., 2017). The Phat code is straightforward and efficient. The input filtrations for the performance comparison are alpha filtrations constructed from 5000 and 50000 random points in $\mathbb{R}^3$. Five trials were undertaken and the average computation time is shown. In Phat, we use the twist algorithm with bit_tree_pivot_column, as recommended by Bauer et al. (2017). The benchmark is executed on a PC with a 1.5 GHz Intel(R) Core(TM) i7-8500Y CPU, 16 GB of memory, and the Debian 10.0 operating system. Both programs run on a single core. Results are shown in Table 1.

According to the benchmark, our new program is approximately 1.20 times slower than Phat. Phat uses $\mathbb{Z}_2$ as a coefficient field and implements fast arithmetic operations by using bit-wise operations. This technique likely makes Phat faster, and we conclude that the performance of our program is roughly comparable to that of Phat.

Probability of torsion appearance
Here we measured the probability of the appearance of torsion in random filtrations by a numerical experiment. Corollary 1 and Corollary 2 already ensure the independence of persistence diagrams from $k$ for a filtration embedded in $\mathbb{R}^2$. Therefore we started from filtrations in $\mathbb{R}^3$.
We generated a random filtration from a pointcloud sampled from a Poisson point process in $[0, 1]^3$. The average number of points is 1000. That is, a random number $k$ is sampled from the Poisson distribution whose parameter is 1000, and $k$ points are sampled uniformly at random in $[0, 1]^3$. An alpha filtration was computed from the generated pointcloud and the condition was judged by HomCloud. Here, 10000 trials were carried out. Only one filtration had non-trivial torsion; thus, 9999 filtrations had trivial torsion. In sum, it can be stated that a filtration with non-trivial torsion is possible, but very rare. This numerical experiment suggests that there is some mathematical mechanism explaining why a random filtration with non-trivial torsion is quite rare. Exploring this further is beyond the scope of the current paper.
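The sampling step of this experiment can be sketched with the standard library alone. The sketch below is an illustration, not the code used for the experiment: the function names and the seed are assumptions, and the Poisson count is drawn by counting exponential inter-arrival times in a unit interval, which is one standard way to sample it. Building the alpha filtration and running the check in HomCloud are not shown.

```python
import random

def poisson_sample(mean, rng):
    """Sample a Poisson(mean) count as the number of arrivals of a
    rate-`mean` Poisson process in the unit time interval."""
    count = 0
    t = rng.expovariate(mean)
    while t <= 1.0:
        count += 1
        t += rng.expovariate(mean)
    return count

def random_pointcloud(mean=1000, dim=3, seed=None):
    """Poisson point process in the unit cube: a Poisson number of
    points, each drawn uniformly and independently."""
    rng = random.Random(seed)
    n = poisson_sample(mean, rng)
    return [tuple(rng.random() for _ in range(dim)) for _ in range(n)]

points = random_pointcloud(seed=0)
print(len(points))
```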
In contrast, Kahle et al. (2018) experimentally showed that torsion subgroups often appear in the random $d$-complex $Y \sim Y_d(n, p)$ introduced by Linial and Meshulam (2006). We apply our algorithm to the random filtrations used in that paper. Let $\bar{Y}(n)$ be a simplex on $n$ vertices and $Y_0$ the $(d - 1)$-skeleton of $\bar{Y}(n)$. $Y_k$ for $k = 1, \dots, m$ is randomly generated by adding a $d$-simplex to $Y_{k-1}$. The $d$-simplex is sampled uniformly at random from all $d$-simplices in $\bar{Y}(n) \setminus Y_{k-1}$. We apply the algorithm to the filtration $Y_0 \subset Y_1 \subset \cdots \subset Y_m$. We used $d = 2$, $n = 75$, $m = 5000$. The number of random filtrations was 10000. In the experiment we found that all 10000 random filtrations have non-trivial torsion.
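For $d = 2$, the random filtration described above can be generated as in the following sketch, which also builds the integer boundary columns in filtration order. The function name, the ordering of vertices and edges inside $Y_0$, and the sign convention for the boundary are assumptions made for this illustration; the torsion check itself is not included.

```python
import random
from itertools import combinations

def random_two_complex_filtration(n, m, seed=None):
    """Full 1-skeleton on n vertices followed by m distinct triangles
    drawn uniformly at random, one triangle per filtration step."""
    rng = random.Random(seed)
    vertices = [(v,) for v in range(n)]
    edges = list(combinations(range(n), 2))
    # Sampling without replacement equals adding uniform new triangles one by one.
    triangles = rng.sample(list(combinations(range(n), 3)), m)
    simplices = vertices + edges + triangles
    index = {s: i for i, s in enumerate(simplices)}
    columns = []
    for s in simplices:
        col = {}
        if len(s) > 1:
            for k in range(len(s)):
                face = s[:k] + s[k + 1:]
                col[index[face]] = (-1) ** k  # alternating boundary signs
        columns.append(col)
    return simplices, columns

simplices, columns = random_two_complex_filtration(n=6, m=10, seed=1)
print(len(simplices), columns[-1])
```

With `n=75` and `m=5000` the same function produces a filtration of the size used in the experiment.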
The above two experiments are contrasting. We expect that the difference comes from the dimension of the space. In the first experiment a filtration is embedded in $\mathbb{R}^3$, and in the second experiment $\bar{Y}(75)$ can be embedded in $\mathbb{R}^{74}$. The above experiments suggest that a random filtration embedded in a higher dimensional space has more non-trivial torsion subgroups in the relative homology groups than a filtration embedded in a lower dimensional space. Exploring the problem further here is also beyond the scope of the current paper.
The first experiment also suggests that we do not need to be concerned about the coefficient field in most cases if the space is $\mathbb{R}^3$. If concerns about the field choice problem do arise in future research contexts, our proposed algorithm would be helpful.

Proof of Theorem 4
In this section, we assume the following conditions.
This means that the filtration is assumed to be right-continuous. This condition is not essential, and we can also prove the theorem if the filtration is left-continuous. We assume right-continuity for expository purposes. From this condition and $H_q(\cup_t X_t) = 0$, all birth-death pairs can be written as $(r_k, r_\ell)$ for $0 \le k < \ell \le N$. Hence, using persistent Betti numbers, we have the following equation: (27) Now we prove the following inequality:

$$f(r_{\ell+1} - r_k) - f(r_{\ell+1} - r_{k+1}) - f(r_\ell - r_k) + f(r_\ell - r_{k+1}) \ge 0. \qquad (28)$$

In addition, if $f$ is strictly convex, the left-hand side is strictly positive. First we prove (28) under the condition $r_{\ell+1} - r_{k+1} \ge r_\ell - r_k$. In this case, by the mean value theorem,

$$f(r_{\ell+1} - r_k) - f(r_{\ell+1} - r_{k+1}) - f(r_\ell - r_k) + f(r_\ell - r_{k+1}) = (f'(\zeta_1) - f'(\zeta_2))(r_{k+1} - r_k),$$

where $r_{\ell+1} - r_{k+1} \le \zeta_1 \le r_{\ell+1} - r_k$ and $r_\ell - r_{k+1} \le \zeta_2 \le r_\ell - r_k$. Here, from the assumption $r_{\ell+1} - r_{k+1} \ge r_\ell - r_k$ and the convexity of $f$, we have $f'(\zeta_1) - f'(\zeta_2) \ge 0$ and the inequality (28). The strict positivity from strict convexity is trivial. When $r_{\ell+1} - r_{k+1} \le r_\ell - r_k$, we can prove the inequality in a similar way by exchanging the roles of $f(r_{\ell+1} - r_{k+1})$ and $f(r_\ell - r_k)$ in the foregoing. From the discussion in Section 4, we have for any $p$, $k$, and $\ell$. Furthermore, if $D_q(X; \mathbb{R}) \ne D_q(X; \mathbb{Z}_p)$, there exists $k < \ell$ such that holds. From (27), (28), (30), and (31), we complete the proof of the theorem.

Conclusions
In this paper, we focus on mathematical phenomena concerning the change of the coefficient field in persistent homology. We show that the torsion of relative homology groups H q (X n , X m ; Z) plays an essential role for the phenomena. We also propose an algorithm to judge the independence of the field change. The algorithm is implemented in software, HomCloud.
Using the algorithm, we show that the probability of persistence diagrams changing as a result of field changes is not zero, but very low for random pointclouds in R 3 . This suggests that we do not need to be particularly concerned about the choice of the field in most practical persistent homology contexts if we approach persistence diagrams in statistical terms. To assuage researchers' future concerns about this issue, the torsion condition can be checked by the algorithm.
Of course, where torsion structures are important, such as Klein bottles or Möbius strips, the choice of the coefficient field is important. The numerical experiment on $\bar{Y}(75)$ also suggests that the choice of the coefficient field is important for high dimensional data. In such contexts, further study of the torsion in the filtrations is required.
Further, the results herein suggest that the "difficulty" of computation of D q (X; k) depends on the torsion. If all torsions are zero, D q (X; k) for any k is computable by computing D q (X; k) for only one k, for example, Z 2 . If not, to compute D q (X; k) for many k is more onerous. In that case, we can apply the algorithm in Boissonnat and Maria (2014) for faster computation; however, that algorithm simultaneously computes D q (X; k) for multiple, but not for all, k. This phenomenon is not dissimilar to a theorem in Dey et al. (2011). Those authors proved that the difficulty of computing a kind of optimization problem on homology algebra depends on the existence of the non-zero torsion subgroup of the relative homology group. Integer programming on homology algebra can be solved by linear programming if the torsion-free condition holds.
Integer programming requires much more time than linear programming in the sense of computational complexity theory. Of course, our paper and Dey et al. (2011) concern different problems, but the results are similar because of the shared focus on the torsions of relative homology. These results suggest that the existence of non-trivial torsion subgroups in relative homology renders the problems of computational homology more difficult.

A Algorithm 1
Here we prove the structure theorem of persistent homology and show why Algorithm 1 yields the correct decomposition. Notation and facts from Section 2 are used as necessary.
Theorem E Let k be a field and assume that a simplicial filtration X : ∅ = X_0 ⊂ · · · ⊂ X_N satisfies Condition 1. Then the persistent homology has a unique interval decomposition as follows: where H_q(X_k; k) is a direct sum of qth PD, dim(X) is the maximum dimension of the simplices in X, H(X_k; k) → H(X_{k+1}; k) is the homology homomorphism induced by the inclusion map X_k → X_{k+1}, b_m ∈ {1, ..., N}, d_m ∈ {2, ..., N} ∪ {∞}, b_m < d_m, and the arrow k → k labelled 1 is the identity map on k. Moreover, Algorithm 1 gives the PD as shown in (4).
To show the theorem, we prepare some notation.
- C_q(X_k), H_q(X_k)
- [·]_k is a homology class in H(X_k)
- B_0 is the boundary matrix of ∂ = ∂_N with respect to the basis {σ_1, ..., σ_N}
To start the proof, we consider the meaning of an interval indecomposable I(b, d).
- A new homology generator is born at H(X_b). This means that a cycle z exists which satisfies z ∈ Z(X_b) \ Z(X_{b−1}).
- The homology generator persists until H(X_{d−1}). This means that [z]_k ∈ H(X_k) is non-zero for any b ≤ k < d.
- The homology generator dies at d. This means that [z]_d = 0 ∈ H(X_d), which is equivalent to z ∈ B(X_d).

Condition 3
(i) {σ_1, ..., σ_k} is a basis of C(X_k)
(ii) ∂σ_j ≠ 0 for j ∈ D
(iii) L_B is a bijection from D to D′ and ∂σ_j = σ_{L_B(j)} for any j ∈ D
(iv) ∂σ_i = 0 for i ∈ D′ ∪ E
From Condition 3, we have the following facts.
and a birth-death pair (i, j)
- For i ∈ E, the following holds and a birth-death pair (i, ∞)
From these facts, it is straightforward that D(X) is given by
To prove the theorem, we show the following three facts.
Fact 1 Algorithm 1 always terminates in a finite number of steps.
Fact 1 can be easily shown since, in the while loop of the algorithm, L_B(j) is strictly decreasing and eventually either becomes −∞ or differs from every element of {L_B(i) | i < j}.
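The termination argument can be seen in a minimal sketch of left-to-right column reduction over Z_2. This assumes that Algorithm 1 is the standard persistence reduction; the function name, the set-based column encoding, and the 0-based indices are choices made for this example only.

```python
def reduce_boundary(columns):
    """Left-to-right reduction of a boundary matrix over Z_2.

    columns[j] is the set of row indices holding a 1 in column j
    (0-based filtration order). Returns (pairs, essential), where
    pairs lists (birth, death) index pairs and essential lists the
    indices whose class never dies.
    """
    columns = [set(c) for c in columns]
    low_to_col = {}  # lowest non-zero row index -> column owning it
    for j, col in enumerate(columns):
        # Each pass removes the current lowest index from column j,
        # so the lowest index strictly decreases and the loop must stop.
        while col and max(col) in low_to_col:
            col ^= columns[low_to_col[max(col)]]  # add earlier column mod 2
        if col:
            low_to_col[max(col)] = j
    pairs = sorted(low_to_col.items())
    paired = set(low_to_col) | set(low_to_col.values())
    essential = [i for i in range(len(columns)) if i not in paired]
    return pairs, essential


# Filled triangle: vertices 0-2, edges 3-5, face 6.
triangle = [set(), set(), set(), {0, 1}, {1, 2}, {0, 2}, {3, 4, 5}]
pairs, essential = reduce_boundary(triangle)
print(pairs, essential)  # [(1, 3), (2, 4), (5, 6)] [0]
```

The while loop exits either when the column becomes empty (the lowest index is −∞ in the notation above) or when its lowest index differs from those of all earlier columns, which mirrors the two terminating cases in the argument for Fact 1.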
Fact 3 is a consequence of the Krull-Schmidt theorem. Here we give a more elementary proof using persistent Betti numbers. From (6), (7), and (8), the multiplicity of each birth-death pair is uniquely determined if the decomposition exists, since the persistent Betti numbers depend only on the ranks of the maps H(X_m; k) → H(X_n; k) for all m ≤ n, and the ranks themselves are independent of the interval decomposition.
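For orientation, the standard inclusion-exclusion relation between multiplicities and ranks can be written as follows. This is a sketch in assumed notation (β^{m,n} for the rank, μ for the multiplicity), with I(b, d) supported on the indices b, ..., d − 1; it is not a reproduction of equations (6), (7), and (8).

```latex
% Persistent Betti number: rank of the map induced by inclusion (m <= n).
\beta^{m,n} = \operatorname{rank}\bigl(H(X_m; k) \to H(X_n; k)\bigr),
\qquad \beta^{0,n} = 0 \quad (\text{since } X_0 = \emptyset).

% If a decomposition into intervals I(b_m, d_m) exists, then
% \beta^{m,n} counts the intervals with b_m <= m and n < d_m, hence
\mu(b, d) = \beta^{b,\,d-1} - \beta^{b-1,\,d-1} - \beta^{b,\,d} + \beta^{b-1,\,d}
\qquad (d \le N),

\mu(b, \infty) = \beta^{b,\,N} - \beta^{b-1,\,N}.
```

Because the right-hand sides involve only ranks of the induced maps, any two interval decompositions must have the same multiplicities.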

A.1 Proof of Fact 2
To show Fact 2, a detailed investigation of Algorithm 1 is required. Since the left-to-right reduction is equivalent to multiplication from the right by the following matrix R_ij(s), B can be written as B_0 U, where U is an upper triangular matrix whose diagonal elements are all 1. Since U^{-1} is also an upper triangular matrix whose diagonal elements are 1, we can show the following fact by elementary matrix calculation.
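One common way to write such an elementary matrix is sketched below. The explicit form of R_ij(s) and the symbol E_ij (the matrix unit with a single 1 in position (i, j)) are assumptions made for illustration and may differ from the definition intended here.

```latex
% Adding s times column i to column j (i < j) of B:
R_{ij}(s) = I + s\,E_{ij}, \qquad
(B\,R_{ij}(s))_{\cdot j} = B_{\cdot j} + s\,B_{\cdot i}, \qquad
(B\,R_{ij}(s))_{\cdot l} = B_{\cdot l} \quad (l \neq j).

% For i < j each R_{ij}(s) is upper triangular with unit diagonal,
% and so is its inverse:
R_{ij}(s)^{-1} = R_{ij}(-s).

% A finite product of such matrices is again upper triangular
% with unit diagonal:
U = R_{i_1 j_1}(s_1)\, R_{i_2 j_2}(s_2) \cdots R_{i_r j_r}(s_r).
```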
Now we defineB and L(j) as follows for the proof.
Now we consider the matrixB = U −1 B 0 U . Let {σ 1 , . . . ,σ N } be the basis of C(X N ) given by the change of coordinate matrix U . Since U is upper triangular, the following relation holds for every 1 ≤ k ≤ N .
From the terminating condition of the while loop in Algorithm 1, we also have the following facts.
Hence, ∂σ j can be written as follows: Now we show the following claim.
Claim The L(j)-th column of B is zero.
Proof We prove the claim by contradiction. We assume that the L(j)-th column of B is non-zero. By applying ∂ to (42), we have
0 = B_{L(j),j} ∂σ_{L(j)} + Σ_{1≤i<L(j)} B_{ij} ∂σ_i.
Clearly, from (43)
From the assumption that the L(j)-th column of B is non-zero, ∂σ_{L(j)} is non-zero and we have j ∈ I, so I is non-empty. Now we consider L(I) = {L(i) | i ∈ I}. When we consider the maximum of L(I), the index i_0 attaining the maximum is unique due to Fact 5. Therefore, in (45), a term of σ_{L(i_0)} appears only once and the coefficient of the term is B_{i_0,j} B_{L(i_0),i_0}. This value is non-zero because of the definition of I, and this contradicts (45).
We define D, D′, and E as follows: D = {j | the jth column of B is non-zero}, D′ = {L(j) | j ∈ D}, E = {j | the jth column of B is zero and L(i) ≠ j for any i ∈ D}.
From the above claim, D and D′ have no common element and hence D, D′, and E form a decomposition of the index set {1, ..., N}. The map L from D to D′ is bijective because of Fact 5. Chains {σ_1, ..., σ_N} are defined as follows: The set of chains {σ_1, ..., σ_N} satisfies Condition 3(i) due to (40) and (47). Conditions (ii), (iii), and (iv) can also be easily shown from the construction of the decomposition and the chains.