Signal classification with a point process distance on the space of persistence diagrams

  • Regular Article
Advances in Data Analysis and Classification

Abstract

In this paper, we consider the problem of signal classification. First, the signal is translated into a persistence diagram through the use of delay-embedding and persistent homology. We then endow the space of persistence diagrams with a metric from point processes, show that the resulting space admits statistical structure in the form of Fréchet means and variances, and establish a classification scheme. In contrast with the Wasserstein distance, this metric accounts for changes in small persistence and changes in cardinality. The classification results using this distance are benchmarked on both synthetic data and real acoustic signals, and the classifier is shown to outperform current signal classification techniques.
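A minimal sketch of the delay-embedding step, assuming a uniformly sampled one-dimensional signal. The function name `delay_embedding` and the parameters `dim` and `tau` are illustrative and are not taken from the paper. Persistent homology of the resulting point cloud would then be computed with a dedicated package such as Ripser (Bauer 2015), which is not shown here.

```python
import math


def delay_embedding(signal, dim, tau):
    """Return the delay vectors (x[i], x[i+tau], ..., x[i+(dim-1)*tau])."""
    if dim < 1 or tau < 1:
        raise ValueError("dim and tau must be positive integers")
    n_vectors = len(signal) - (dim - 1) * tau
    if n_vectors <= 0:
        raise ValueError("signal too short for this dim and tau")
    return [
        tuple(signal[i + j * tau] for j in range(dim))
        for i in range(n_vectors)
    ]


# Hypothetical test signal: a sine wave with a period of 40 samples.
samples = [math.sin(2 * math.pi * k / 40) for k in range(200)]

# With a delay of a quarter period, each delay vector is (sin, cos) of the
# same phase, so the embedded points lie on the unit circle.
cloud = delay_embedding(samples, dim=2, tau=10)
```

A periodic signal embeds as a closed loop, which is the kind of one-dimensional topological feature a persistence diagram records.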

References

  • Adcock A, Carlsson E, Carlsson G (2016) The ring of algebraic functions on persistence bar codes. Homol Homotopy Appl 18(1):381–402

  • Adler RJ, Bobrowski O, Weinberger S (2014) Crackle: the homology of noise. Discrete Comput Geom 52(4):680–704

  • Azimi-Sadjadi MR, Yang Y, Srinivasan S (2007) Acoustic classification of battlefield transient events using wavelet subband features. In: Proceedings of SPIE defense and security symposium, p 6562

  • Bampasidou M, Gentimis T (2014) Modeling collaborations with persistent homology. arXiv preprint arXiv:1403.5346

  • Bauer U (2015) Ripser. https://github.com/Ripser/ripser

  • Bogert BP, Healy MJ, Tukey JW (1963) The quefrency alanysis of time series for echoes: cepstrum, pseudo-autocovariance, cross-cepstrum and saphe cracking. In: Proceedings of the symposium on time series analysis, chapter, vol 15, pp 209–243

  • Bubenik P (2015) Statistical topological data analysis using persistence landscapes. J Mach Learn Res 16(1):77–102

  • Carlsson G (2009) Topology and data. Bull Am Math Soc 46(2):255–308

  • Chazal F, Cohen-Steiner D, Glisse M, Guibas LJ, Oudot SY (2009) Proximity of persistence modules and their diagrams. In: Proceedings of the twenty-fifth annual symposium on Computational geometry. ACM, pp 237–246

  • Cohen-Steiner D, Edelsbrunner H, Harer J, Mileyko Y (2010) Lipschitz functions have \(L_p\)-stable persistence. Found Comput Math 10(2):127–139

  • Dhanalakshmi P, Palanivel S, Ramalingam V (2009) Classification of audio signals using SVM and RBFNN. Expert Syst Appl 36(3):6069–6075

  • Edelsbrunner H, Harer J (2010) Computational topology: an introduction. American Mathematical Society, Providence

  • Emrani S, Gentimis T, Krim H (2015) Persistent homology of delay embeddings and its application to wheeze detection. IEEE Signal Process Lett 21(4):459–463

  • Fasy BT, Kim J, Lecci F, Maria C, Rouvreau V (2015) TDA: statistical tools for topological data analysis. R package version 1.4.1. https://CRAN.R-project.org/package=TDA

  • Garrett D, Peterson DA, Anderson CW, Thaut MH (2003) Comparison of linear, nonlinear, and feature selection methods for eeg signal classification. IEEE Trans Neural Syst Rehabil Eng 11:141–166

  • Hatcher A (2002) Algebraic topology. Cambridge University Press, Cambridge

  • Kerber M, Morozov D, Nigmetov A (2016) Geometry helps to compare persistence diagrams. In: Proceedings of the eighteenth workshop on algorithm engineering and experiments, pp 103–112

  • Krim H, Gentimis T, Chintakunta H (2016) Discovering the whole by the coarse: a topological paradigm for data analysis. IEEE Signal Process Mag 33(2):95–104

  • Kuhn HW (1955) The Hungarian method for the assignment problem. Naval Res Log Q 2:83–87

  • Law K, Stewart A, Zygalakis K (2015) Data assimilation: a mathematical introduction. Springer, Berlin

  • Lum PY, Singh G, Lehman A, Ishkanov T, Vejdemo-Johansson M, Alagappan M, Carlsson J, Carlsson G (2013) Extracting insights from the shape of complex data using topology. Sci Rep 3(3):1236

  • Maroulas V, Nebenführ A (2015) Tracking rapid intracellular movements: a Bayesian random set approach. Ann Appl Stat 9(2):926–949

  • Mileyko Y, Mukherjee S, Harer J (2011) Probability measures on the space of persistence diagrams. Inverse Problems 27(12):124007

  • Nicolau M, Levine A, Carlsson G (2011) Topology based data analysis identifies a subgroup of breast cancers with a unique mutational profile and excellent survival. Proc Nat Acad Sci 108(17):7265–7270

  • Oppenheim AV, Schafer RW (2004) From frequency to quefrency: a history of the cepstrum. IEEE Signal Process Mag 21:95–106

  • Reininghaus J, Huber S, Bauer U, Kwitt R (2015) A stable multi-scale kernel for topological machine learning. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4741–4748

  • Robins V, Turner K (2016) Principal component analysis of persistent homology rank functions with case studies of spatial point patterns, sphere packing and colloids. Physica D 334:99–117

  • Schuhmacher D, Vo B, Vo B (2008) A consistent metric for performance evaluation of multi-object filters. IEEE Trans Signal Process 56:3447–3457

  • Seversky LM, Davis S, Berger M (2016) On time-series topological data analysis: new data and opportunities. In: The IEEE conference on computer vision and pattern recognition, pp 59–67

  • Sherwin J, Sajda P (2013) Musical experts recruit action-related neural structures in harmonic anomaly detection: evidence for embodied cognition in expertise. Brain Cogn 83:190–202

  • Srinivas U, Nasrabadi NM, Monga V (2013) Graph-based multi-sensor fusion for acoustic signal classification. In: 2013 IEEE international conference on acoustics, speech and signal processing (ICASSP), pp 261–265

  • Takens F (1980) Detecting strange attractors in turbulence. In: Dynamical systems and turbulence, Warwick 1980. Lecture notes in mathematics, vol 898, pp 366–381

  • Turner K, Mileyko Y, Mukherjee S, Harer J (2014) Fréchet means for distributions of persistence diagrams. Discrete Comput Geom 52(1):44–70

  • Venkataraman V, Ramamurthy KN, Turaga P (2016) Persistent homology of attractors for action recognition. In: 2016 IEEE international conference on image processing (ICIP), pp 4150–4154

  • Xia K, Wei GW (2014) Persistent homology analysis of protein structure, flexibility, and folding. Int J Numer Methods Biomed Eng 30(8):814–844

  • Zhang H, Nasrabadi NM, Huang TS, Zhang Y (2011) Transient acoustic signal classification using joint sparse representation. In: 2011 IEEE international conference on acoustics, speech and signal processing (ICASSP), pp 2220–2223

Acknowledgements

VM would like to thank the Army Research Office for its support via Grant \(\#\) W911NF-17-1-0313. Both authors would like to thank Dr. Tung-Duong Tran-Luu for providing the Army Research Lab’s acoustic signal dataset and for useful discussions. The authors would also like to thank five anonymous reviewers for their comments, which substantially improved the manuscript.

Author information

Corresponding author

Correspondence to Vasileios Maroulas.

Appendices

Appendix A

Proof that \(d^c_p\) is a metric

We adapt the proof from Schuhmacher et al. (2008) to the space \(P_W\). By Definition 5, \(d^c_p \ge 0\), and \(d^c_p\) is symmetric and satisfies the identity of indiscernibles. It remains to show the triangle inequality. Consider three persistence diagrams \(\mathbb {D}_1 = (t_1,\ldots ,t_\ell ), \mathbb {D}_2 = (u_1,\ldots ,u_n), \mathbb {D}_3 = (v_1,\ldots ,v_m)\). Assume that \(\ell \le n\) and that at most one of the cardinalities is zero. Since W is a closed and bounded subset of \(\mathbb {R}^2\), we may choose dummy points \((a_i)_{i \in \mathbb {N}}\) and \((b_i)_{i \in \mathbb {N}}\) at distance at least c from W and from each other. The two cases we must consider are \(\ell \le n \le m\) and \(\ell ,m \le n\).

We first treat the case \(\ell \le n \le m\). Extend the persistence diagram \(\mathbb {D}_1\) with points \(t_{\ell + j} = a_j\) for \(1 \le j \le m - \ell \), and similarly extend \(\mathbb {D}_2\) with \(u_{n + j} = b_j\) for \(1 \le j \le m-n\). This way the cardinality difference in Eq. (2) is equal to zero. After the dummy points have been added, let \(\eta \) and \(\nu \) be the minimizing permutations from \(\mathbb {D}_1\) to \(\mathbb {D}_3\) and from \(\mathbb {D}_3\) to \(\mathbb {D}_2\), respectively. Then, by Eq. (2) and the fact that \(a \le c^pm\) implies \(\frac{a}{m} \le \frac{a + c^p(n-m)}{n}\), we have

$$\begin{aligned} d^c_p(\mathbb {D}_1,\mathbb {D}_2)= & {} \left( \frac{1}{n} \min _{\pi \in \varPi _n} \sum _{i=1}^n \min (c,||t_i - u_{\pi (i)}||_\infty )^p\right) ^\frac{1}{p}\nonumber \\\le & {} \left( \frac{1}{m} \min _{\pi \in \varPi _m} \sum _{i=1}^m \min (c,||t_i - u_{\pi (i)}||_\infty )^p\right) ^\frac{1}{p} \end{aligned}$$
(12)

The right hand side of Eq. (12) can further be bounded by

$$\begin{aligned}&\left( \frac{1}{m} \sum _{i=1}^m \min (c,||t_i - v_{\eta (i)}||_\infty )^p + \min (c,||v_i - u_{\nu (i)}||_\infty )^p\right) ^\frac{1}{p}\nonumber \\&\quad \le \left( \frac{1}{m} \sum _{i=1}^m \min (c,||t_i - v_{\eta (i)}||_\infty )^p\right) ^\frac{1}{p} + \left( \frac{1}{m} \sum _{i=1}^m \min (c,||v_i - u_{\nu (i)}||_\infty )^p\right) ^\frac{1}{p}\nonumber \\&\quad = d^c_p(\mathbb {D}_1,\mathbb {D}_3) + d^c_p(\mathbb {D}_3,\mathbb {D}_2) \end{aligned}$$
(13)

Note that in Eq. (13), we map from \(\mathbb {D}_1\) to \(\mathbb {D}_3\) optimally (via the permutation \(\eta \)) and then from \(\mathbb {D}_3\) to \(\mathbb {D}_2\) optimally (via the permutation \(\nu \)).

The second case is \(\ell ,m \le n\). As above, take \(\eta \) and \(\nu \) to be the minimizing permutations from \(\mathbb {D}_1\) to \(\mathbb {D}_3\) and from \(\mathbb {D}_3\) to \(\mathbb {D}_2\), respectively. Then, similarly, we have that

$$\begin{aligned} d^c_p(\mathbb {D}_1,\mathbb {D}_2)= & {} \left( \frac{1}{n} \min _{\pi \in \varPi _n} \sum _{i=1}^n \min (c,||t_i - u_{\pi (i)}||_\infty )^p\right) ^\frac{1}{p}\nonumber \\\le & {} \left( \frac{1}{m} \sum _{i=1}^m \min (c,||t_i - v_{\eta (i)}||_\infty )^p + \min (c,||v_i - u_{\nu (i)}||_\infty )^p\right) ^\frac{1}{p}\nonumber \\\le & {} \left( \frac{1}{m} \sum _{i=1}^m \min (c,||t_i - v_{\eta (i)}||_\infty )^p\right) ^\frac{1}{p}\nonumber \\&+\, \left( \frac{1}{m} \sum _{i=1}^m \min (c,||v_i - u_{\nu (i)}||_\infty )^p\right) ^\frac{1}{p}\nonumber \\= & {} d^c_p(\mathbb {D}_1,\mathbb {D}_3) + d^c_p(\mathbb {D}_3,\mathbb {D}_2) \end{aligned}$$
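The distance and its triangle inequality can be checked numerically on small examples. The sketch below assumes that Eq. (2), which is not reproduced in this excerpt, has the form used in the proof above: an optimal assignment of the smaller diagram into the larger one under the sup norm truncated at c, plus a penalty of \(c^p\) for each unmatched point, averaged over the larger cardinality n. The function name `dcp` and the three sample diagrams are hypothetical, and the brute-force search over assignments is only practical for very small diagrams; an assignment solver such as the Hungarian method (Kuhn 1955) would be the natural replacement.

```python
from itertools import permutations


def dcp(d1, d2, c, p):
    """Sketch of d^c_p under the assumed form of Eq. (2).

    With l = |d1| <= n = |d2|:
    ((min over assignments of sum_i min(c, ||t_i - u_pi(i)||_inf)^p
      + c^p * (n - l)) / n) ** (1 / p)
    """
    if len(d1) > len(d2):
        d1, d2 = d2, d1  # enforce l <= n; the distance is symmetric
    l, n = len(d1), len(d2)
    if n == 0:
        return 0.0

    def cost(t, u):
        sup = max(abs(t[0] - u[0]), abs(t[1] - u[1]))
        return min(c, sup) ** p

    # Brute force over injective assignments of d1's points into d2.
    best = min(
        sum(cost(t, d2[j]) for t, j in zip(d1, idx))
        for idx in permutations(range(n), l)
    )
    return ((best + c ** p * (n - l)) / n) ** (1.0 / p)


# Hypothetical diagrams given as (birth, death) pairs.
D1 = [(0.0, 1.0)]
D2 = [(0.0, 1.0), (0.2, 0.3)]
D3 = [(0.1, 1.1), (0.2, 0.4), (0.5, 0.9)]

for a, b, mid in permutations([D1, D2, D3], 3):
    # Triangle inequality: d(a, b) <= d(a, mid) + d(mid, b)
    assert dcp(a, b, 0.5, 2) <= dcp(a, mid, 0.5, 2) + dcp(mid, b, 0.5, 2) + 1e-12
```

Here `D1` and `D2` share one point exactly, so their distance comes entirely from the cardinality penalty for the single unmatched point of `D2`.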

Appendix B

Proof of Lemma 1

By Appendix A, it is clear that this is a metric space. We first show completeness. Let \(\{\mathbb {D}_n\}_{n=1}^\infty \) be a Cauchy sequence of persistence diagrams. It is clear that for some \(k_0\), we have that \(j,l \ge k_0\) implies \(|\mathbb {D}_j| = |\mathbb {D}_l| = k\), so we may assume without loss of generality that all cardinalities are equal to k. Fix an \(\epsilon > 0\). There is N such that for \(n,m > N\), \(d_p^c(\mathbb {D}_n,\mathbb {D}_m) < \epsilon \). In particular, since their cardinalities are the same, we have that

$$\begin{aligned} d^c_p(\mathbb {D}_n,\mathbb {D}_m) = \left( \frac{1}{k}\min _{\pi \in \varPi _k} \sum _{i=1}^k \left\| x^n_i - x^m_{\pi (i)}\right\| _\infty ^p\right) ^\frac{1}{p} < \epsilon \end{aligned}$$

and so we have that, for a given point \(x^n_i \in \mathbb {D}_n\),

$$\begin{aligned} \left\| x^n_i - x^m_{\pi (i)}\right\| _\infty < (k)^\frac{1}{p}\epsilon \end{aligned}$$

where \(\pi \) is the minimizing permutation.

Thus, there is a sequence of points \(x^n_i, x^{n+1}_{\pi _{n+1}(i)}, x^{n+2}_{\pi _{n+2}(i)},\ldots \) such that the distance between any two points in this sequence is less than \(2(k)^\frac{1}{p}\epsilon \) via the triangle inequality, where \(\pi _{n+\alpha }\) is the minimizing permutation between persistence diagrams \(\mathbb {D}_n\) and \(\mathbb {D}_{n+\alpha }\). This is a Cauchy sequence in W under the \(\infty \)-norm. Since W is complete, this sequence converges to some limit \(x_i \in W\). Repeating this for each element in \(\mathbb {D}_n\), we generate a persistence diagram \(\mathbb {D}^*\) consisting of points \((x_1,\ldots ,x_k)\) chosen as the limits above.

Therefore, for any fixed \(\epsilon > 0\), since each sequence above converges to the corresponding limit, there is some N such that for \(j > N\) we have \(||x^{j}_{i} - x_i||_\infty < \epsilon \) for every i, after relabeling the points of \(\mathbb {D}_j\) according to the matching above. This implies that

$$\begin{aligned} d_p^c(\mathbb {D}_j,\mathbb {D}^*)= & {} \left( \frac{1}{k}\min _{\pi \in \varPi _k} \sum _{i=1}^k ||x^j_i - x_{\pi (i)}||_\infty ^p\right) ^\frac{1}{p} \le \left( \frac{1}{k} \sum _{i=1}^k ||x^j_i - x_i||_\infty ^p\right) ^\frac{1}{p} \\< & {} \left( \frac{1}{k} k\epsilon ^p\right) ^\frac{1}{p} = \epsilon \end{aligned}$$

Since every Cauchy sequence converges to a limit in this space, the space is complete.

Finally, it remains to show separability. Consider the space \(P_{\mathbb {Q} \cap W,k}\) of all persistence diagrams with points in \(\mathbb {Q} \cap W\) and cardinality less than or equal to k. Then for any persistence diagram \(\mathbb {D}\) and any \(\epsilon > 0\), find \(\mathbb {D}_q \in P_{\mathbb {Q} \cap W,k}\) such that \(|\mathbb {D}| = |\mathbb {D}_q| = k\) and for all \(x_i \in \mathbb {D}\), there is a corresponding \(y_{x_i} \in \mathbb {D}_q\) such that \(||x_i - y_{x_i}||_\infty ^p \le \epsilon \). Then

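$$\begin{aligned} d^c_p(\mathbb {D},\mathbb {D}_q)^p = \frac{1}{k} \left( \min _{\pi \in \varPi _k} \sum _{i=1}^k ||x_i - y_{\pi (i)}||_\infty ^p\right) \le \frac{1}{k} \sum _{i=1}^k ||x_i - y_{x_i}||_\infty ^p \le \frac{1}{k} \sum _{i=1}^k \epsilon = \epsilon \end{aligned}$$

Since \(\epsilon \) is arbitrary and \(P_{\mathbb {Q} \cap W,k}\) is countable, this set is a countable dense subset, which proves separability.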

\(\square \)
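The separability argument can be illustrated numerically: replacing each coordinate of a diagram by a nearby rational number yields a diagram of the same cardinality at small distance. The sketch below is an illustration under stated assumptions, not part of the proof. The helper names `rational_approx` and `dist_equal_card` and the sample diagram are hypothetical, the truncation at c is omitted as in the formulas of this appendix, and the rational points are assumed to remain in W.

```python
from fractions import Fraction
from itertools import permutations
import math


def rational_approx(diagram, max_den):
    """Replace each coordinate by the closest rational with denominator <= max_den."""
    return [
        tuple(Fraction(x).limit_denominator(max_den) for x in point)
        for point in diagram
    ]


def dist_equal_card(d1, d2, p):
    """(1/k * min_pi sum_i ||x_i - y_pi(i)||_inf^p)^(1/p) for |d1| = |d2| = k."""
    k = len(d1)
    if k != len(d2) or k == 0:
        raise ValueError("diagrams must be non-empty and of equal cardinality")
    best = min(
        sum(
            max(abs(float(x[0]) - float(y[0])), abs(float(x[1]) - float(y[1]))) ** p
            for x, y in zip(d1, perm)
        )
        for perm in permutations(d2)
    )
    return (best / k) ** (1.0 / p)


# Hypothetical diagram with irrational coordinates.
diagram = [(math.sqrt(2) / 4, math.pi / 4), (math.e / 10, math.sqrt(3) / 2)]

# Finer rational grids give closer approximating diagrams.
distances = {
    max_den: dist_equal_card(diagram, rational_approx(diagram, max_den), p=2)
    for max_den in (10, 100, 1000)
}
```

Each coordinate moves by at most half the grid spacing `1 / max_den`, so the distance to the rational diagram is at most `1 / (2 * max_den)` and can be made as small as desired.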

Cite this article

Marchese, A., Maroulas, V. Signal classification with a point process distance on the space of persistence diagrams. Adv Data Anal Classif 12, 657–682 (2018). https://doi.org/10.1007/s11634-017-0294-x
