Persistence diagrams with linear machine learning models

Abstract

Persistence diagrams are widely recognized as compact descriptors of multiscale topological features in data. When many datasets are available, statistical features embedded in their persistence diagrams can be extracted by machine learning. For practical applications, it is particularly important to be able to trace those statistical features explicitly back to the original data space, that is, to perform inverse analysis. In this paper, we propose a unified method for this inverse analysis that combines linear machine learning models with persistence images. The method is applied to point clouds and cubical sets, demonstrating the capability of statistical inverse analysis and its advantages.

Notes

  1. A topological space X with \(\tilde{H}_{q}(X)=0\) for any q is called acyclic, where \(\tilde{H}_q(X)\) is the reduced homology of X.

  2. A multiset is a set in which each point has a multiplicity.

  3. In Robins et al. (2016), the birth/death positions are called critical points.

  4. CGAL: https://www.cgal.org/ (Da et al. 2017).

  5. Scikit-learn: http://scikit-learn.org/ (Pedregosa et al. 2011).

  6. http://www.wpi-aimr.tohoku.ac.jp/hiraoka_labo/homcloud/index.en.html.

  7. DIPHA: A Distributed Persistent Homology Algorithm (Bauer et al. 2014).

  8. SciPy: Open Source Scientific Tools for Python, 2001–, http://www.scipy.org/ (Jones et al. 2001–).

  9. OpenCV: https://opencv.org/

  10. In Kimura et al. (2018), images from the final stage are also used. In this paper, we use only early- and intermediate-stage images to focus on the initial changes in the reaction.

References

  1. Adams, H., Chepushtanova, S., Emerson, T., Hanson, E., Kirby, M., Motta, F., Neville, R., Peterson, C., Shipman, P., Ziegelmeier, L.: Persistence images: a stable vector representation of persistent homology. J. Mach. Learn. Res. 18(8), 1–35 (2017)

  2. Bauer, U., Kerber, M., Reininghaus, J.: Distributed computation of persistent homology. In: Proceedings of the Sixteenth Workshop on Algorithm Engineering and Experiments (ALENEX) (2014)

  3. Bauer, U., Kerber, M., Reininghaus, J., Wagner, H.: Phat – persistent homology algorithms toolbox. J. Symb. Comput. 78, 76–90 (2017)

  4. Bingham, N.H., Fry, J.M.: Regression: Linear Models in Statistics. Springer, Berlin (2010)

  5. Bishop, C.M.: Pattern Recognition and Machine Learning (Information Science and Statistics). Springer, Berlin (2007)

  6. Bubenik, P.: Statistical topological data analysis using persistence landscapes. J. Mach. Learn. Res. 16(1), 77–102 (2015)

  7. Buchet, M., Hiraoka, Y., Obayashi, I.: Persistent homology and materials informatics. In: Tanaka, I. (ed.) Nanoinformatics, pp. 75–95. Springer, Berlin (2018)

  8. Carlsson, G.: Topology and data. Bull. Am. Math. Soc. 46, 255–308 (2009)

  9. Chazal, F., Glisse, M., Labruère, C., Michel, B.: Convergence rates for persistence diagram estimation in topological data analysis. J. Mach. Learn. Res. 16, 3603–3635 (2015)

  10. Chan, J.M., Carlsson, G., Rabadan, R.: Topology of viral evolution. PNAS 110(46), 18566–18571 (2013)

  11. Cohen-Steiner, D., Edelsbrunner, H., Harer, J.: Stability of persistence diagrams. Discret. Comput. Geom. 37(1), 103–120 (2007)

  12. Csurka, G., Bray, C., Dance, C., Fan, L.: Visual categorization with bags of keypoints. In: Proceedings of ECCV Workshop on Statistical Learning in Computer Vision, pp. 59–74 (2004)

  13. Da, T.K.F., Loriot, S., Yvinec, M.: 3D Alpha Shapes. CGAL User and Reference Manual 4.11, CGAL Editorial Board (2017)

  14. Delgado-Friedrichs, O., Robins, V., Sheppard, A.: Morse theory and persistent homology for topological analysis of 3D images of complex materials. In: 2014 IEEE International Conference on Image Processing (ICIP), pp. 4872–4876 (2014)

  15. Delgado-Friedrichs, O., Robins, V., Sheppard, A.: Skeletonization and partitioning of digital images using discrete Morse theory. IEEE Trans. Pattern Anal. Mach. Intell. 37(3), 654–666 (2015)

  16. de Silva, V., Ghrist, R.: Coverage in sensor networks via persistent homology. Algebraic Geom. Topol. 7, 339–358 (2007)

  17. Dey, T.K., Hirani, A.N., Krishnamoorthy, B.: Optimal homologous cycles, total unimodularity and linear programming. SIAM J. Comput. 40(4), 1026–1044 (2011)

  18. Edelsbrunner, H., Letscher, D., Zomorodian, A.: Topological persistence and simplification. Discret. Comput. Geom. 28(4), 511–533 (2002)

  19. Edelsbrunner, H., Harer, J.: Computational Topology: An Introduction. AMS, Providence (2010)

  20. Escolar, E.G., Hiraoka, Y.: Optimal cycles for persistent homology via linear programming. In: Optimization in the Real World: Toward Solving Real-World Optimization Problems, pp. 79–96. Springer Japan, Osaka (2016)

  21. Fasy, B.T., Lecci, F., Rinaldo, A., Wasserman, L., Balakrishnan, S., Singh, A.: Confidence sets for persistence diagrams. Ann. Stat. 42(6), 2301–2339 (2014)

  22. Hiraoka, Y., Nakamura, T., Hirata, A., Escolar, E.G., Matsue, K., Nishiura, Y.: Hierarchical structures of amorphous solids characterized by persistent homology. Proc. Nat. Acad. Sci. USA 113, 7035–7040 (2016)

  23. Ichinomiya, T., Obayashi, I., Hiraoka, Y.: Persistent homology analysis of craze formation. Phys. Rev. E 95(1), 012504 (2017)

  24. Jones, E., Oliphant, T., Peterson, P., et al.: SciPy: Open source scientific tools for Python. http://www.scipy.org/ (2001–) [Online; accessed 2018-01-20]

  25. Kaczynski, T., Mischaikow, K., Mrozek, M.: Computational Homology. Springer, Berlin (2004)

  26. Kimura, M., Obayashi, I., Takeuchi, Y., Hiraoka, Y.: Finding trigger sites in heterogeneous reactions using persistent-homology without preliminary material scientific information. Sci. Rep. 8, 3553 (2018)

  27. Kusano, G., Fukumizu, K., Hiraoka, Y.: Persistence weighted Gaussian kernel for topological data analysis. In: Proceedings of the 33rd International Conference on Machine Learning, JMLR: W&CP 48, pp. 2004–2013 (2016)

  28. Kusano, G., Fukumizu, K., Hiraoka, Y.: Kernel method for persistence diagrams via kernel embedding and weight factor. Accepted in Journal of Machine Learning Research

  29. Lowe, D.G.: Object recognition from local scale invariant features. In: Proc. of IEEE International Conference on Computer Vision, pp. 1150–1157 (1999)

  30. Nowak, E., Jurie, F., Triggs, B.: Sampling Strategies for Bag-of-Features Image Classification. In: Computer Vision – ECCV 2006: 9th European Conference on Computer Vision, Graz, Austria, May 7-13, 2006, Proceedings, Part IV, pp. 490–503 (2006)

  31. Otter, N., Porter, M.A., Tillmann, U., Grindrod, P., Harrington, H.A.: A roadmap for the computation of persistent homology. arXiv:1506.08903

  32. Pearson, D.A., Bradley, R.M., Motta, F.C., Shipman, P.D.: Producing nanodot arrays with improved hexagonal order by patterning surfaces before ion sputtering. Phys. Rev. E 92(6), 062401 (2015)

  33. Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E.: Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)

  34. Rajan, K.: Materials informatics. Mater. Today 8(10), 38–45 (2005)

  35. Rajan, K.: Materials informatics. Mater. Today 15(11), 470 (2012)

  36. Reininghaus, J., Huber, S., Bauer, U., Kwitt, R.: A Stable Multi-Scale Kernel for Topological Machine Learning. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition, pp. 4741–4748 (2015)

  37. Tibshirani, R.: Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B (Methodol.) 58(1), 267–288 (1996)

  38. Robins, V., Turner, K.: Principal component analysis of persistent homology rank functions with case studies of spatial point patterns, sphere packing and colloids. Phys. D 334, 99–117 (2016)

  39. Robins, V., Saadatfar, M., Delgado-Friedrichs, O., Sheppard, A.P.: Percolating length scales from topological persistence analysis of micro-CT images of porous materials. Water Resour. Res. 52(1), 315–329 (2016)

  40. Saadatfar, M., Takeuchi, H., Francois, N., Robins, V., Hiraoka, Y.: Pore configuration landscape of granular crystallisation. Nat. Commun. 8, 15082 (2017). https://doi.org/10.1038/ncomms15082

  41. Sivic, J., Zisserman, A.: Video Google: A Text Retrieval Approach to Object Matching in Videos. In: Proc. of IEEE International Conference on Computer Vision, pp. 1470–1477 (2003)

  42. Turner, K., Mileyko, Y., Mukherjee, S., Harer, J.: Fréchet means for distributions of persistence diagrams. Discret. Comput. Geom. 52(1), 44–70 (2014)

  43. Zomorodian, A., Carlsson, G.: Computing persistent homology. Discret. Comput. Geom. 33(2), 249–274 (2005)

Author information

Corresponding author

Correspondence to Ippei Obayashi.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

This work is partially supported by JSPS KAKENHI Grant Number JP16K17638, JST CREST Mathematics 15656429, the JST “Materials research by Information Integration” Initiative (MI2I) project of the Support Program for Starting Up Innovation Hub, Structural Materials for Innovation Strategic Innovation Promotion Program D72 and D66, and the New Energy and Industrial Technology Development Organization (NEDO).

A Algorithm for generating random images

The algorithm for generating random binary images is given in Algorithm 2. It takes six parameters: \(W, N, S\in \mathbb {N}\), \(\sigma _1> 0\), \(\sigma _2 > 0\), and \(t >0\). The white pixels in the generated image are given by the orbits of the Brownian motion of N particles on a flat torus of size \(W \times W\). The parameters S and \(\sigma _1\) determine the length of each orbit, and \(\sigma _2\) and t determine the radii of the particles. In this paper we fix \(W=300\), \(\sigma _1 = 4\), \(\sigma _2 = 2\), and \(t = 0.01\), and vary only N and S. As N and S become larger, the generated image tends to have more white pixels.
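The following Python sketch illustrates one plausible reading of the description above. It is not a transcription of Algorithm 2. It assumes that each particle takes S Gaussian steps with standard deviation \(\sigma _1\) on the periodic \(W \times W\) grid, that the visited pixels are accumulated into a count image, and that the particle radius arises from Gaussian smoothing with standard deviation \(\sigma _2\) followed by thresholding at t. The function name `random_binary_image`, the `seed` argument, and the default values of N and S are hypothetical.

```python
import numpy as np
from scipy.ndimage import gaussian_filter


def random_binary_image(W=300, N=100, S=30, sigma1=4.0, sigma2=2.0,
                        t=0.01, seed=None):
    """Sketch of a random binary image built from Brownian orbits on a torus.

    Assumption: the orbit is a discrete random walk with N(0, sigma1^2)
    steps, and the particle radius comes from smoothing with sigma2 and
    thresholding at t. The paper's Algorithm 2 may differ in detail.
    """
    rng = np.random.default_rng(seed)
    hits = np.zeros((W, W), dtype=float)

    # Start each of the N particles at a uniformly random position.
    pos = rng.uniform(0.0, W, size=(N, 2))

    for _ in range(S):
        # Mark the pixel under each particle; np.add.at handles the case
        # where several particles share a pixel.
        idx = np.floor(pos).astype(int) % W
        np.add.at(hits, (idx[:, 0], idx[:, 1]), 1.0)
        # Brownian step, wrapped around to stay on the flat torus.
        pos = (pos + rng.normal(0.0, sigma1, size=(N, 2))) % W

    # Periodic Gaussian smoothing turns each visited pixel into a blob;
    # thresholding at t fixes the effective radius of that blob.
    smooth = gaussian_filter(hits, sigma=sigma2, mode="wrap")
    return smooth > t  # True marks a white pixel
```

Calling `random_binary_image(N=200, S=50, seed=0).mean()` gives the fraction of white pixels under these assumptions, which makes it easy to check how that fraction changes as N and S vary.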

Disordered images of this kind are frequently obtained from experimental measurements in materials science, such as X-CT and TEM (Kimura et al. 2018). Such seemingly disordered images are expected to be useful for materials informatics, and one of the motivations of this paper is to develop a universal framework for this purpose.

[Algorithm 2: generation of random binary images]

About this article

Cite this article

Obayashi, I., Hiraoka, Y. & Kimura, M. Persistence diagrams with linear machine learning models. J Appl. and Comput. Topology 1, 421–449 (2018). https://doi.org/10.1007/s41468-018-0013-5

Keywords

  • Topological data analysis
  • Persistent homology
  • Machine learning
  • Linear models
  • Persistence image

Mathematics Subject Classification

  • 55-04
  • 55U99
  • 62P35
  • 62J07