Abstract
In this paper, we consider the problem of signal classification. First, the signal is translated into a persistence diagram through the use of delay-embedding and persistent homology. Endowing the data space of persistence diagrams with a metric from point processes, we show that it admits statistical structure in the form of Fréchet means and variances, and we establish a classification scheme. In contrast with the Wasserstein distance, this metric accounts for changes in small persistence and changes in cardinality. The classification results using this distance are benchmarked on both synthetic data and real acoustic signals, and the classifier is shown to outperform current signal classification techniques.
References
Adcock A, Carlsson E, Carlsson G (2016) The ring of algebraic functions on persistence bar codes. Homol Homotopy Appl 18(1):381–402
Adler RJ, Bobrowski O, Weinberger S (2014) Crackle: the homology of noise. Discrete Comput Geom 52(4):680–704
Azimi-Sadjadi MR, Yang Y, Srinivasan S (2007) Acoustic classification of battlefield transient events using wavelet subband features. In: Proceedings of SPIE defense and security symposium, p 6562
Bampasidou M, Gentimis T (2014) Modeling collaborations with persistent homology. arXiv preprint arXiv:1403.5346
Bauer U (2015) Ripser. https://github.com/Ripser/ripser
Bogert BP, Healy MJ, Tukey JW (1963) The quefrency alanysis of time series for echoes: cepstrum, pseudo-autocovariance, cross-cepstrum and saphe cracking. In: Proceedings of the symposium on time series analysis, chapter, vol 15, pp 209–243
Bubenik P (2015) Statistical topological data analysis using persistence landscapes. J Mach Learn Res 16(1):77–102
Carlsson G (2009) Topology and data. Bull Am Math Soc 46(2):255–308
Chazal F, Cohen-Steiner D, Glisse M, Guibas LJ, Oudot SY (2009) Proximity of persistence modules and their diagrams. In: Proceedings of the twenty-fifth annual symposium on Computational geometry. ACM, pp 237–246
Cohen-Steiner D, Edelsbrunner H, Harer J, Mileyko Y (2010) Lipschitz functions have \(L_p\)-stable persistence. Found Comput Math 10(2):127–139
Dhanalakshmi P, Palanivel S, Ramalingam V (2009) Classification of audio signals using SVM and RBFNN. Expert Syst Appl 36(3):6069–6075
Edelsbrunner H, Harer J (2010) Computational topology: an introduction. American Mathematical Society, Providence
Emrani S, Gentimis T, Krim H (2015) Persistent homology of delay embeddings and its application to wheeze detection. IEEE Signal Process Lett 21(4):459–463
Fasy BT, Kim J, Lecci F, Maria C, Rouvreau V (2015) TDA: statistical tools for topological data analysis. R package version 1.4.1 (the included GUDHI is authored by Clement Maria, Dionysus by Dmitriy Morozov, and PHAT by Ulrich Bauer, Michael Kerber and Jan Reininghaus). https://CRAN.R-project.org/package=TDA
Garrett D, Peterson DA, Anderson CW, Thaut MH (2003) Comparison of linear, nonlinear, and feature selection methods for EEG signal classification. IEEE Trans Neural Syst Rehabil Eng 11:141–166
Hatcher A (2002) Algebraic topology. Cambridge University Press, Cambridge
Kerber M, Morozov D, Nigmetov A (2016) Geometry helps to compare persistence diagrams. In: Proceedings of the eighteenth workshop on algorithm engineering and experiments, pp 103–112
Krim H, Gentimis T, Chintakunta H (2016) Discovering the whole by the coarse: a topological paradigm for data analysis. IEEE Signal Process Mag 33(2):95–104
Kuhn HW (1955) The Hungarian method for the assignment problem. Naval Res Log Q 2:83–87
Law K, Stewart A, Zygalakis K (2015) Data assimilation: a mathematical introduction. Springer, Berlin
Lum PY, Singh G, Lehman A, Ishkanov T, Vejdemo-Johansson M, Alagappan M, Carlsson J, Carlsson G (2013) Extracting insights from the shape of complex data using topology. Sci Rep 3(3):1236
Maroulas V, Nebenführ A (2015) Tracking rapid intracellular movements: a Bayesian random set approach. Ann Appl Stat 9(2):926–949
Mileyko Y, Mukherjee S, Harer J (2011) Probability measures on the space of persistence diagrams. Inverse Problems 27(12):124007
Nicolau M, Levine A, Carlsson G (2011) Topology based data analysis identifies a subgroup of breast cancers with a unique mutational profile and excellent survival. Proc Nat Acad Sci 108(17):7265–7270
Oppenheim AV, Schafer RW (2004) From frequency to quefrency: a history of the cepstrum. IEEE Signal Process Mag 21:95–106
Reininghaus J, Huber S, Bauer U, Kwitt R (2015) A stable multi-scale kernel for topological machine learning. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4741–4748
Robins V, Turner K (2016) Principal component analysis of persistent homology rank functions with case studies of spatial point patterns, sphere packing and colloids. Physica D 334:99–117
Schuhmacher D, Vo B, Vo B (2008) A consistent metric for performance evaluation of multi-object filters. IEEE Trans Signal Process 56:3447–3457
Seversky LM, Davis S, Berger M (2016) On time-series topological data analysis: new data and opportunities. In: The IEEE conference on computer vision and pattern recognition, pp 59–67
Sherwin J, Sajda P (2013) Musical experts recruit action-related neural structures in harmonic anomaly detection: evidence for embodied cognition in expertise. Brain Cogn 83:190–202
Srinivas U, Nasrabadi NM, Monga V (2013) Graph-based multi-sensor fusion for acoustic signal classification. In: 2013 IEEE international conference on acoustics, speech and signal processing (ICASSP), pp 261–265
Takens F (1980) Detecting strange attractors in turbulence. In: Dynamical systems and turbulence, Warwick 1980. Lecture notes in mathematics, vol 898, pp 366–381
Turner K, Mileyko Y, Mukherjee S, Harer J (2014) Fréchet means for distributions of persistence diagrams. Discrete Comput Geom 52(1):44–70
Venkataraman V, Ramamurthy KN, Turaga P (2016) Persistent homology of attractors for action recognition. In: 2016 IEEE international conference on image processing (ICIP), pp 4150–4154
Xia K, Wei GW (2014) Persistent homology analysis of protein structure, flexibility, and folding. Int J Numer Methods Biomed Eng 30(8):814–844
Zhang H, Nasrabadi NM, Huang TS, Zhang Y (2011) Transient acoustic signal classification using joint sparse representation. In: 2011 IEEE international conference on acoustics, speech and signal processing (ICASSP), pp 2220–2223
Acknowledgements
VM would like to thank the Army Research Office for its support via Grant No. W911NF-17-1-0313. Both authors would like to thank Dr. Tung-Duong Tran-Luu for providing the Army Research Lab’s acoustic signal dataset and for useful discussions. The authors would also like to thank five anonymous reviewers for their comments, which substantially improved the manuscript.
Appendices
Appendix A
Proof that \(d^c_p\) is a metric
We adapt the proof from Schuhmacher et al. (2008) to the space \(P_W\). By Definition 5, it is clear that \(d^c_p \ge 0\) and that \(d^c_p\) is symmetric and satisfies the identity property. It remains to show the triangle inequality. We consider three persistence diagrams \(\mathbb {D}_1 = (t_1,\ldots ,t_\ell ), \mathbb {D}_2 = (u_1,\ldots ,u_n), \mathbb {D}_3 = (v_1,\ldots ,v_m)\). Assume that \(\ell \le n\) and that at most one of the cardinalities is zero. Since W is a closed and bounded subset of \(\mathbb {R}^2\), we may choose dummy points \((a_i)_{i \in \mathbb {N}}\) and \((b_i)_{i \in \mathbb {N}}\) at distance at least c from W and from each other. The two cases we must consider are \(\ell \le n \le m\) and \(\ell ,m \le n\).
We first treat the case \(\ell \le n \le m\). Extend the persistence diagram \(\mathbb {D}_1\) with points \(t_{\ell + j} = a_j\) for \(1 \le j \le m - \ell \), and similarly extend \(\mathbb {D}_2\) with \(u_{n + j} = b_j\) for \(1 \le j \le m-n\). This way the cardinality difference is equal to zero in Eq. (2). Moreover, after the dummy points have been added, let \(\eta \) and \(\nu \) be the minimizing permutations from \(\mathbb {D}_1\) to \(\mathbb {D}_3\) and from \(\mathbb {D}_3\) to \(\mathbb {D}_2\), respectively. Then, by Eq. (2), and since \(a \le c^pm\) implies \(\frac{a}{m} \le \frac{a + c^p(n-m)}{n}\), we have
The right-hand side of Eq. (12) can be further bounded by
Note that in Eq. (13), we map from \(\mathbb {D}_1\) to \(\mathbb {D}_3\) optimally (via the permutation \(\eta \)) and then from \(\mathbb {D}_3\) to \(\mathbb {D}_2\) optimally (via the permutation \(\nu \)).
The second case is when \(\ell ,m \le n\). Take \(\eta \) and \(\nu \) to be the minimum permutations from \(\mathbb {D}_1\) to \(\mathbb {D}_3\) and from \(\mathbb {D}_3\) to \(\mathbb {D}_2\) respectively as above. Then, similarly, we have that
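As a numerical sanity check on the triangle inequality (not a substitute for the proof), the distance can be evaluated on small random diagrams. The sketch below assumes that Eq. (2) has the OSPA-type form of Schuhmacher et al. (2008): a cut-off \(\infty \)-norm matching cost plus a penalty of \(c^p\) per unmatched point, normalized by the larger cardinality. The names `sup_dist`, `d_pc`, `random_diagram` and `triangle_violations` are hypothetical. The matching is found by brute force over injections, which is only feasible for tiny diagrams; in practice an assignment solver such as the Hungarian method (Kuhn 1955) would take its place.

```python
import itertools
import random


def sup_dist(a, b):
    """Sup-norm distance between two points (birth, death) in the plane."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))


def d_pc(X, Y, c=1.0, p=2):
    """Assumed OSPA-type distance between two diagrams given as lists of points.

    Brute-force search over all injections of the smaller diagram into the
    larger one, so this is only intended for very small diagrams.
    """
    if len(X) > len(Y):
        X, Y = Y, X  # ensure |X| = m <= n = |Y|
    m, n = len(X), len(Y)
    if n == 0:
        return 0.0  # both diagrams are empty
    # Minimal total cut-off matching cost over injections X -> Y.
    best = min(
        sum(min(c, sup_dist(x, Y[j])) ** p for x, j in zip(X, idx))
        for idx in itertools.permutations(range(n), m)
    )
    # Each of the n - m unmatched points is charged the maximal cost c^p.
    return ((best + c ** p * (n - m)) / n) ** (1.0 / p)


def random_diagram(rng, max_points=4):
    """Random diagram with 0..max_points points above the diagonal."""
    points = []
    for _ in range(rng.randint(0, max_points)):
        birth = rng.random()
        points.append((birth, birth + rng.random()))
    return points


def triangle_violations(trials=200, seed=0, c=0.5, p=2):
    """Count random triples that violate the triangle inequality."""
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        D1, D2, D3 = (random_diagram(rng) for _ in range(3))
        lhs = d_pc(D1, D2, c, p)
        rhs = d_pc(D1, D3, c, p) + d_pc(D3, D2, c, p)
        if lhs > rhs + 1e-12:  # small tolerance for floating-point error
            bad += 1
    return bad


if __name__ == "__main__":
    print("violations:", triangle_violations())
```

The random triples include empty diagrams and diagrams of unequal cardinality, so both cases treated in the proof are exercised.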
Appendix B
Proof of Lemma 1
By Appendix A, it is clear that this is a metric space. We first show completeness. Let \(\{\mathbb {D}_n\}_{n=1}^\infty \) be a Cauchy sequence of persistence diagrams. It is clear that for some \(k_0\), we have that \(j,l \ge k_0\) implies \(|\mathbb {D}_j| = |\mathbb {D}_l| = k\), so we may assume without loss of generality that the associated cardinalities are equal. Fix an \(\epsilon > 0\). Note there is N such that for \(n,m > N\), \(d_p^c(\mathbb {D}_n,\mathbb {D}_m) < \epsilon \). In particular, since their cardinalities are the same, we have that
and so we have that, for a given point \(x^n_i \in \mathbb {D}_n\),
where \({\pi (i)}\) is the minimal permutation.
Thus, there is a sequence of points \(x^n_i, x^{n+1}_{\pi _{n+1}(i)}, x^{n+2}_{\pi _{n+2}(i)},\ldots \) such that the distance between any two points in this sequence is less than \(2k^{\frac{1}{p}}\epsilon \) via the triangle inequality, where \(\pi _{n+\alpha }\) is the minimal permutation between persistence diagrams \(\mathbb {D}_n\) and \(\mathbb {D}_{n+\alpha }\). This is a Cauchy sequence in W under the \(\infty \)-norm. Since W is complete, this sequence converges to some limit \(x_i \in W\). Repeating this for each element in \(\mathbb {D}_n\), we generate a persistence diagram \(\mathbb {D}^*\) consisting of points \((x_1,\ldots ,x_k)\) chosen as the limits above.
Therefore, for any fixed \(\epsilon > 0\), since each sequence above converges to its corresponding limit, there is some N such that for \(j > N\) we have \(||x^{j}_{i} - x_i||_\infty < \epsilon \). This implies that
Since this sequence converges to a limit in this space, this space is complete.
Finally, it remains to show separability. Consider the space \(P_{\mathbb {Q} \bigcap W,k}\) of all persistence diagrams with points in \(\mathbb {Q} \bigcap W\) and cardinality less than or equal to k. Then for any persistence diagram \(\mathbb {D}\), find \(\mathbb {D}_q \in P_{\mathbb {Q} \bigcap W,k}\) such that \(|\mathbb {D}| = |\mathbb {D}_q| = k\) and for all \(x_i \in \mathbb {D}\), there is a corresponding \(y_{x_i} \in \mathbb {D}_q\) such that \(||x_i - y_{x_i}||_\infty ^p \le \epsilon \). Then
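The resulting bound can be sketched as follows, under the assumption (not restated in this appendix) that Definition 5 has the OSPA-type form of Schuhmacher et al. (2008), in which equal cardinalities make the cardinality penalty vanish. Pairing each \(x_i\) with \(y_{x_i}\) is one admissible matching, so the minimum over permutations is no larger, and the cut-off at c can only decrease each term:

```latex
\[
d_p^c(\mathbb{D},\mathbb{D}_q)
\le \left( \frac{1}{k} \sum_{i=1}^{k} \min\bigl(c,\ \|x_i - y_{x_i}\|_\infty\bigr)^p \right)^{1/p}
\le \left( \frac{1}{k} \sum_{i=1}^{k} \|x_i - y_{x_i}\|_\infty^p \right)^{1/p}
\le \epsilon^{1/p}.
\]
```

Since \(\epsilon \) is arbitrary and the collection of finite diagrams with rational points is countable, a bound of this kind yields a countable dense subset.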
\(\square \)
Cite this article
Marchese, A., Maroulas, V. Signal classification with a point process distance on the space of persistence diagrams. Adv Data Anal Classif 12, 657–682 (2018). https://doi.org/10.1007/s11634-017-0294-x
DOI: https://doi.org/10.1007/s11634-017-0294-x
Keywords
- Classification of time series
- Data space of persistence diagrams
- Wasserstein metric
- Cardinality
- Persistent homology