Detecting Smooth Cluster Changes in Evolving Graph Structures

Chapter in: Learning from Data Streams in Evolving Environments

Part of the book series: Studies in Big Data (SBD, volume 41)

Abstract

Graph mining is a set of techniques for finding useful patterns in various types of structured data. Many effective algorithms for mining static graphs have been proposed. However, graphs such as those of human relationships and evolving genes change over time, and such evolving graphs require different algorithms for analysis. In this chapter, we explain a method called O2I for clustering in evolving graphs that can detect changes in clusters over time. O2I partitions the graph sequence into smooth clusters, even when the numbers of clusters and vertices vary. It first constructs a single graph from the graph sequence, then uses spectral clustering and the RatioCut criterion to apply k-way partitioning to this graph. O2I is compared in detail with the preserving clustering membership (PCM) algorithm, a conventional online graph-sequence clustering algorithm in which the numbers of clusters and vertices must remain constant. We further show that, in contrast to PCM, the performance of O2I does not depend on the clustering of the initial graph in the graph sequence. Experiments on synthetic evolving graphs show that O2I is practical to calculate and addresses the main disadvantages of PCM. Further tests on real-world data show that O2I can obtain reasonable clusters. This method is hence a flexible clustering solution and will be useful in a wide range of graph-mining applications in which the connections, number of clusters, and number of vertices of the graphs evolve over time.
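The pipeline outlined in the abstract (build one graph from the sequence, then partition it spectrally) can be illustrated with a minimal sketch. This is not the O2I algorithm itself: the abstract does not say how the combined graph is constructed, so the function `stack_sequence` and its `temporal_weight` parameter are hypothetical, and linking each vertex to its own copy in the next snapshot is an assumption made only for illustration. The partitioning step is the standard two-way RatioCut relaxation (sign of the Fiedler vector of the unnormalized Laplacian), computed here by shifted power iteration so that the example needs only the standard library.

```python
import math

def stack_sequence(graphs, temporal_weight=1.0):
    """Combine T snapshots (n x n weight matrices) into one (T*n) x (T*n) matrix.

    ASSUMPTION (not from the chapter): the copy of vertex i at step t is linked
    to its copy at step t+1 with weight `temporal_weight`.
    """
    T, n = len(graphs), len(graphs[0])
    W = [[0.0] * (T * n) for _ in range(T * n)]
    for t, g in enumerate(graphs):
        for i in range(n):
            for j in range(n):
                W[t * n + i][t * n + j] = float(g[i][j])
            if t + 1 < T:
                a, b = t * n + i, (t + 1) * n + i
                W[a][b] = W[b][a] = temporal_weight
    return W

def laplacian(W):
    """Unnormalized graph Laplacian L = D - W."""
    n = len(W)
    deg = [sum(row) for row in W]
    return [[(deg[i] if i == j else 0.0) - W[i][j] for j in range(n)]
            for i in range(n)]

def fiedler_vector(L, iters=2000):
    """Eigenvector of the second-smallest eigenvalue of L (connected graph).

    Power iteration on c*I - L, with the constant vector projected out at
    every step; c exceeds the largest eigenvalue (at most 2 * max degree).
    """
    n = len(L)
    c = 2.0 * max(L[i][i] for i in range(n)) + 1.0
    v = [math.sin(i + 1.0) for i in range(n)]  # deterministic start vector
    for _ in range(iters):
        mean = sum(v) / n
        v = [x - mean for x in v]
        w = [c * v[i] - sum(L[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def bipartition(W):
    """Two-way relaxed RatioCut: split vertices by the sign of the Fiedler vector."""
    v = fiedler_vector(laplacian(W))
    labels = [1 if x >= 0 else 0 for x in v]
    if labels[0] == 1:  # fix the sign ambiguity so that vertex 0 gets label 0
        labels = [1 - l for l in labels]
    return labels

def two_triangles(bridge):
    """Toy snapshot: triangles {0,1,2} and {3,4,5} joined by the edge 2-3."""
    g = [[0.0] * 6 for _ in range(6)]
    for a, b in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
        g[a][b] = g[b][a] = 1.0
    g[2][3] = g[3][2] = bridge
    return g

# Two snapshots whose only difference is the strength of the bridging edge.
labels = bipartition(stack_sequence([two_triangles(0.1), two_triangles(0.2)]))
print(labels)  # -> [0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1]
```

The first six labels belong to the first snapshot and the last six to the second. In this toy input both snapshots fall into the same two triangles, which is the temporally smooth outcome one would hope for; a k-way version would instead run k-means on the first k eigenvectors, which is omitted here to keep the sketch dependency-free.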

Notes

  1. The weight 0 means that there is no connection between two vertices. We need these zeros to create Laplacian matrices in Sect. 2.2.

  2. Because of space limitations, we omit X^T X = I henceforth.

  3. The number of vertices in the third detected cluster sequence increases with time as 〈0, 1, 4, 8, 13, 16, 16, 18, 18, 18〉.

  4. Vector y_i is the same symbol used in Algorithm 2.
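
For orientation, the Laplacian and the orthonormality constraint mentioned in notes 1 and 2 match the standard RatioCut relaxation as given in von Luxburg's tutorial on spectral clustering. The notation below (W for the weight matrix, D for the degree matrix, A_i for the clusters, n for the number of vertices) follows that tutorial and is an assumption here; the chapter's own symbols may differ.

```latex
% Standard RatioCut relaxation; notation assumed, not taken from the chapter.
\begin{align}
  L &= D - W, \qquad D_{ii} = \sum_{j} w_{ij}, \\
  \mathrm{RatioCut}(A_1, \dots, A_k)
    &= \frac{1}{2} \sum_{i=1}^{k} \frac{W(A_i, \overline{A_i})}{|A_i|}, \\
  \min_{X \in \mathbb{R}^{n \times k}} \operatorname{Tr}\!\left(X^{\top} L X\right)
    &\quad \text{subject to} \quad X^{\top} X = I .
\end{align}
```

A zero entry w_ij simply contributes nothing to D and L, which is why absent edges are stored as weight 0. The relaxed problem is solved by taking the eigenvectors of L for the k smallest eigenvalues as the columns of X.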

Corresponding author

Correspondence to Akihiro Inokuchi.

Copyright information

© 2019 Springer International Publishing AG, part of Springer Nature

About this chapter

Cite this chapter

Okui, S., Osamura, K., Inokuchi, A. (2019). Detecting Smooth Cluster Changes in Evolving Graph Structures. In: Sayed-Mouchaweh, M. (eds) Learning from Data Streams in Evolving Environments. Studies in Big Data, vol 41. Springer, Cham. https://doi.org/10.1007/978-3-319-89803-2_10

  • DOI: https://doi.org/10.1007/978-3-319-89803-2_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-89802-5

  • Online ISBN: 978-3-319-89803-2

  • eBook Packages: Engineering, Engineering (R0)
