Detecting Smooth Cluster Changes in Evolving Graph Structures

Okui, Sohei; Osamura, Kaho; Inokuchi, Akihiro

doi:10.1007/978-3-319-89803-2_10

Part of the book series: Studies in Big Data ((SBD,volume 41))

Abstract

Graph mining is a set of techniques for finding useful patterns in various types of structured data. Many effective algorithms for mining static graphs have been proposed. However, graphs that represent human relationships or evolving genes change over time, and such evolving graphs require different algorithms for analysis. In this chapter, we explain O2I, a clustering method for evolving graphs that can detect changes in clusters over time. O2I partitions the graph sequence into smooth clusters, even when the numbers of clusters and vertices vary. It first constructs a graph from the graph sequence, then applies k-way partitioning to this graph using spectral clustering with the RatioCut objective. O2I is compared in detail with the preserving clustering membership (PCM) algorithm, a conventional online graph-sequence clustering algorithm in which the numbers of clusters and vertices must remain constant. We further show that, in contrast to PCM, the performance of O2I does not depend on the clustering of the initial graph in the graph sequence. Experiments on synthetic evolving graphs show that O2I is computationally practical and addresses the main disadvantages of PCM. Further tests on real-world data show that O2I can obtain reasonable clusters. The method is hence a flexible clustering solution and will be useful in a wide range of graph-mining applications in which the connections, number of clusters, and number of vertices of the graphs evolve over time.
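
To make the RatioCut objective mentioned in the abstract concrete, the following is a minimal sketch for a single static graph. It is not the O2I algorithm and does not reproduce how the chapter builds a graph from a graph sequence. It assumes a symmetric weight matrix `W` in which a weight of 0 means no edge, uses the unnormalized Laplacian L = D − W, and defines RatioCut as the sum over clusters of the weight leaving the cluster divided by the cluster size (some texts include an extra factor of 1/2). The toy graph and all function names are illustrative assumptions. The sketch checks the standard identity that, for the scaled cluster-indicator matrix H, the trace of HᵀLH equals the RatioCut value, which is the quantity that spectral clustering relaxes.

```python
from math import sqrt, isclose

def laplacian(W):
    """Unnormalized Laplacian L = D - W of a symmetric weight matrix W."""
    n = len(W)
    return [[(sum(W[i]) if i == j else 0.0) - W[i][j] for j in range(n)]
            for i in range(n)]

def ratio_cut(W, labels, k):
    """Sum over clusters of (weight leaving the cluster) / (cluster size).

    Assumes every cluster 0..k-1 has at least one vertex.
    """
    n = len(W)
    total = 0.0
    for c in range(k):
        members = [i for i in range(n) if labels[i] == c]
        outside = [i for i in range(n) if labels[i] != c]
        cut = sum(W[i][j] for i in members for j in outside)
        total += cut / len(members)
    return total

def indicator_matrix(labels, k):
    """n x k matrix H with H[i][c] = 1/sqrt(|A_c|) if vertex i is in cluster c."""
    sizes = [labels.count(c) for c in range(k)]
    return [[1.0 / sqrt(sizes[c]) if labels[i] == c else 0.0
             for c in range(k)]
            for i in range(len(labels))]

def trace_HtLH(L, H):
    """Trace of H^T L H, computed column by column."""
    n, k = len(H), len(H[0])
    return sum(H[i][c] * L[i][j] * H[j][c]
               for c in range(k) for i in range(n) for j in range(n))

# Hypothetical toy graph: two triangles {0,1,2} and {3,4,5}
# joined by a single weak edge (2,3) of weight 0.5.
W = [[0.0] * 6 for _ in range(6)]
edges = [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
         (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0), (2, 3, 0.5)]
for i, j, w in edges:
    W[i][j] = W[j][i] = w

L = laplacian(W)
good = [0, 0, 0, 1, 1, 1]   # split along the weak edge
bad = [0, 1, 0, 1, 0, 1]    # split that cuts through both triangles

rc_good = ratio_cut(W, good, 2)
rc_bad = ratio_cut(W, bad, 2)
tr_good = trace_HtLH(L, indicator_matrix(good, 2))
print(rc_good, rc_bad, tr_good)
```

The partition along the weak edge yields the smaller RatioCut value, and the trace matches it. Minimizing the trace over all real matrices with orthonormal columns, instead of over indicator matrices only, gives the relaxation that spectral clustering solves with the eigenvectors of L.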

Notes

1. A weight of 0 means that there is no connection between two vertices. We need these zeros to create the Laplacian matrices in Sect. 2.2.
2. Because of space limitations, we omit X^T X = I henceforth.
3. The number of vertices in the third detected cluster sequence increases with time as 〈0, 1, 4, 8, 13, 16, 16, 18, 18, 18〉.
4. Vector y_i is the same symbol as used in Algorithm 2.

Author information

Authors and Affiliations

Graduate School of Science and Technology, Kwansei Gakuin University, Sanda, Japan
Sohei Okui
School of Science and Technology, Kwansei Gakuin University, Sanda, Japan
Kaho Osamura & Akihiro Inokuchi

Corresponding author

Correspondence to Akihiro Inokuchi.

Editor information

Editors and Affiliations

Institute Mines-Telecom Lille Douai, Douai, France
Moamar Sayed-Mouchaweh

About this chapter

Cite this chapter

Okui, S., Osamura, K., Inokuchi, A. (2019). Detecting Smooth Cluster Changes in Evolving Graph Structures. In: Sayed-Mouchaweh, M. (eds) Learning from Data Streams in Evolving Environments. Studies in Big Data, vol 41. Springer, Cham. https://doi.org/10.1007/978-3-319-89803-2_10

DOI: https://doi.org/10.1007/978-3-319-89803-2_10
Published: 29 July 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-89802-5
Online ISBN: 978-3-319-89803-2
eBook Packages: Engineering, Engineering (R0)
