Z-Embedding: A Spectral Representation of Event Intervals for Efficient Clustering and Classification

  • Conference paper
Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2020)

Abstract

Sequences of event intervals occur in several application domains, but their inherent complexity hinders scalable solutions to tasks such as clustering and classification. In this paper, we propose a novel spectral embedding representation of event interval sequences that relies on bipartite graphs. More concretely, each event interval sequence is represented by a bipartite graph in three main steps: (1) creating a hash table that can quickly convert a collection of event interval sequences into a bipartite graph representation, (2) creating and regularizing a bi-adjacency matrix corresponding to the bipartite graph, and (3) defining a spectral embedding mapping on the bi-adjacency matrix. In addition, we show that classification performance can be improved substantially through pruning parameters that capture the nature of the relations formed by the event intervals. We demonstrate through extensive experimental evaluation on five real-world datasets that our approach can obtain runtime speedups of up to two orders of magnitude compared to other state-of-the-art methods, with similar or better clustering and classification performance.
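The three steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and makes several assumptions that do not come from the paper: intervals are `(label, start, end)` tuples, only three coarse temporal relations are distinguished, the regularization adds a constant `alpha / n` to every matrix entry, and the embedding is taken from the singular vectors of the degree-normalized matrix. All function names and the example data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

import numpy as np


def relation(a, b):
    """Coarse temporal relation between two intervals (assumed simplification).

    Expects a to start no later than b; each interval is (label, start, end).
    """
    _, _, e1 = a
    _, s2, e2 = b
    if e1 < s2:
        return "before"
    if e1 >= e2:
        return "contains"
    return "overlaps"


def build_hash_table(sequences):
    """Step 1: map (label_a, label_b, relation) -> {sequence id: count}."""
    table = defaultdict(lambda: defaultdict(int))
    for seq_id, seq in enumerate(sequences):
        ordered = sorted(seq, key=lambda iv: (iv[1], iv[2], iv[0]))
        for a, b in combinations(ordered, 2):
            table[(a[0], b[0], relation(a, b))][seq_id] += 1
    return table


def biadjacency(table, n_sequences):
    """Step 2a: rows are sequences, columns are relation triples."""
    keys = sorted(table)
    B = np.zeros((n_sequences, len(keys)))
    for j, key in enumerate(keys):
        for seq_id, count in table[key].items():
            B[seq_id, j] = count
    return B, keys


def spectral_embedding(B, dim, alpha=1.0):
    """Steps 2b and 3: regularize, normalize by degrees, embed via SVD."""
    n, _ = B.shape
    B_reg = B + alpha / n                  # assumed constant regularization
    d_rows = B_reg.sum(axis=1)
    d_cols = B_reg.sum(axis=0)
    L = B_reg / np.sqrt(np.outer(d_rows, d_cols))
    U, _, _ = np.linalg.svd(L, full_matrices=False)
    # The first left singular vector only encodes row degrees, so skip it.
    return U[:, 1:dim + 1]


# Hypothetical toy data: the first two sequences share the same relation.
sequences = [
    [("A", 0, 5), ("B", 3, 8)],
    [("A", 0, 5), ("B", 2, 9)],
    [("C", 0, 10), ("D", 2, 4), ("A", 12, 15)],
]
table = build_hash_table(sequences)
B, keys = biadjacency(table, len(sequences))
Z = spectral_embedding(B, dim=1)
```

In this toy example the first two sequences contain the same relation and therefore receive the same embedding coordinate, while the third sequence is placed apart from them.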

This work was partly supported by the VR-2016-03372 Swedish Research Council Starting Grant, as well as the EXTREME project funded by the Digital Futures framework.

Notes

  1. https://github.com/zedshape/zembedding.

  2. This is even an underestimate: in the cases where competitors did not finish within the one-hour execution time limit, our approach is at least 300 times faster.

Author information

Corresponding author

Correspondence to Zed Lee.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Lee, Z., Girdzijauskas, Š., Papapetrou, P. (2021). Z-Embedding: A Spectral Representation of Event Intervals for Efficient Clustering and Classification. In: Hutter, F., Kersting, K., Lijffijt, J., Valera, I. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2020. Lecture Notes in Computer Science, vol 12457. Springer, Cham. https://doi.org/10.1007/978-3-030-67658-2_41

  • DOI: https://doi.org/10.1007/978-3-030-67658-2_41

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-67657-5

  • Online ISBN: 978-3-030-67658-2

  • eBook Packages: Computer Science (R0)
