Discovering patterns in time-varying graphs: a triclustering approach

Regular Article

Abstract

This paper introduces a novel technique to track structures in time-varying graphs. The method uses a maximum a posteriori approach to fit a three-dimensional co-clustering of the source vertices, the destination vertices and the time to the data under study, without requiring any hyper-parameter tuning. The three dimensions are segmented simultaneously in order to build clusters of source vertices, clusters of destination vertices and time segments such that the edge distributions across clusters of vertices follow the same evolution over the time segments. The main novelty of the approach is that the time segments are inferred directly from the evolution of the edge distribution between the vertices, so the user does not have to make any a priori quantization of time. Experiments conducted on artificial data illustrate the good behavior of the technique, and a study of a real-life data set shows the potential of the proposed approach for exploratory data analysis.
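As a rough illustration of the data representation described above, the following sketch counts the edges of a small time-varying graph in each cell of a three-dimensional grid (source cluster, destination cluster, time segment). It is only a sketch under stated assumptions: the function name `tricluster_counts`, the toy edges, the cluster assignments and the time cut are all hypothetical and fixed by hand, whereas the method described in the abstract infers the clusters and the time segments from the data. The maximum a posteriori criterion and its optimization are not shown.

```python
from bisect import bisect_right
from collections import Counter


def tricluster_counts(edges, src_cluster, dst_cluster, time_cuts):
    """Count edges per (source cluster, destination cluster, time segment).

    edges: iterable of (source, destination, timestamp) triples.
    src_cluster, dst_cluster: dicts mapping a vertex to its cluster label.
    time_cuts: sorted list of cut points; a timestamp t falls in segment k
        when time_cuts[k-1] <= t < time_cuts[k] (segment 0 is before the
        first cut, the last segment is at or after the last cut).
    """
    counts = Counter()
    for source, destination, timestamp in edges:
        segment = bisect_right(time_cuts, timestamp)
        counts[(src_cluster[source], dst_cluster[destination], segment)] += 1
    return counts


# Hypothetical toy data: six time-stamped edges between three source
# vertices and two destination vertices.
edges = [
    ("a", "x", 1.0),
    ("a", "y", 2.0),
    ("b", "x", 2.5),
    ("b", "y", 7.0),
    ("c", "x", 8.0),
    ("c", "x", 9.5),
]

# Hand-picked partitions (assumptions for illustration only).
src_cluster = {"a": 0, "b": 0, "c": 1}
dst_cluster = {"x": 0, "y": 0}
time_cuts = [5.0]  # two time segments: t < 5.0 and t >= 5.0

counts = tricluster_counts(edges, src_cluster, dst_cluster, time_cuts)
for cell, n in sorted(counts.items()):
    print(cell, n)
```

Each key of `counts` is one cell of the grid. A triclustering method of the kind summarized in the abstract would compare many candidate partitions of the three dimensions through such cell counts; here a single candidate is evaluated.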

Keywords

Co-clustering · Time-varying graph · Graph mining · Model selection

Mathematics Subject Classification

62P25 62H30 62G07 62B10 62F15 

Notes

Acknowledgments

The authors thank the anonymous reviewers and the associate editor for their valuable comments, which helped improve this paper.

Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  • Romain Guigourès (1)
  • Marc Boullé (1)
  • Fabrice Rossi (2)

  1. Orange Labs, Lannion, France
  2. SAMM EA 4543, Université Paris 1, Paris, France
