Computational Statistics, Volume 32, Issue 2, pp 501–533

The dynamic random subgraph model for the clustering of evolving networks

  • Rawya Zreik
  • Pierre Latouche
  • Charles Bouveyron
Original Paper

Abstract

In recent years, many clustering methods have been proposed to extract information from networks. The principle is to look for groups of vertices with homogeneous connection profiles. Most of these techniques are designed for static networks, that is, they do not take the temporal dimension into account. This work is motivated by the need to analyze evolving networks for which a decomposition into subgraphs is given. We therefore consider the random subgraph model (RSM), which was recently proposed to model networks through latent clusters built within known partitions. RSM is extended to dynamic networks by using a state space model to characterize the cluster proportions. We call the resulting model the dynamic random subgraph model (dRSM). A variational expectation maximization (VEM) algorithm is proposed to perform inference. We show that the variational approximations lead to an update step involving a new state space model, from which the parameters and the hidden states can be estimated using the standard Kalman filter and Rauch–Tung–Striebel smoother. Simulated data sets are used to assess the proposed methodology. Finally, dRSM and the corresponding VEM algorithm are applied to an original maritime network built from printed Lloyd’s voyage records.
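
As an illustration of the two standard tools named in the abstract, the following is a minimal sketch of a Kalman filter followed by a Rauch–Tung–Striebel smoother for a scalar linear Gaussian state space model. The model used here (state x_t = a·x_{t-1} + w_t, observation y_t = x_t + v_t, with noise variances q and r) and the function names `kalman_filter` and `rts_smoother` are assumptions chosen for illustration. They are not the state space model derived for dRSM, which this abstract does not specify.

```python
def kalman_filter(ys, a, q, r, m0, p0):
    """Scalar Kalman filter (illustrative model, not the dRSM one).

    State:       x_t = a * x_{t-1} + w_t,  w_t ~ N(0, q)
    Observation: y_t = x_t + v_t,          v_t ~ N(0, r)
    Prior:       x_0 ~ N(m0, p0)
    Returns predicted and filtered means and variances for each time step.
    """
    m_pred, p_pred, m_filt, p_filt = [], [], [], []
    m, p = m0, p0
    for y in ys:
        # Prediction step: propagate the previous estimate through the dynamics.
        mp = a * m
        pp = a * a * p + q
        # Update step: correct the prediction with the new observation.
        gain = pp / (pp + r)
        m = mp + gain * (y - mp)
        p = (1.0 - gain) * pp
        m_pred.append(mp)
        p_pred.append(pp)
        m_filt.append(m)
        p_filt.append(p)
    return m_pred, p_pred, m_filt, p_filt


def rts_smoother(a, m_pred, p_pred, m_filt, p_filt):
    """Rauch-Tung-Striebel backward pass over the Kalman filter output."""
    m_smooth = list(m_filt)
    p_smooth = list(p_filt)
    # The last smoothed estimate equals the last filtered estimate.
    for t in range(len(m_filt) - 2, -1, -1):
        g = p_filt[t] * a / p_pred[t + 1]  # smoother gain
        m_smooth[t] = m_filt[t] + g * (m_smooth[t + 1] - m_pred[t + 1])
        p_smooth[t] = p_filt[t] + g * g * (p_smooth[t + 1] - p_pred[t + 1])
    return m_smooth, p_smooth


if __name__ == "__main__":
    ys = [0.9, 1.1, 1.0, 1.2, 0.8]  # made-up observations for the demo
    mp, pp, mf, pf = kalman_filter(ys, a=1.0, q=0.1, r=0.5, m0=0.0, p0=1.0)
    ms, ps = rts_smoother(1.0, mp, pp, mf, pf)
    for t, (f, s) in enumerate(zip(mf, ms), start=1):
        print(f"t={t}  filtered={f:.3f}  smoothed={s:.3f}")
```

The forward pass uses only past and present observations, while the backward pass revises each estimate using later observations as well, so the smoothed variances are never larger than the filtered ones.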

Keywords

  • State space model
  • Variational inference
  • Variational expectation maximization
  • Maritime data

Notes

Acknowledgments

The authors would like to thank César Ducruet, from the Géographie-Cités laboratory, Paris, France, for providing the maritime network and for his painstaking analysis of the results. The data were collected in the context of the ERC Grant No. 313847 “World Seastems” (http://www.world-seastems.cnrs.fr). The authors would also like to thank Catherine Matias and Stéphane Robin for their useful remarks and comments on this work.

Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  • Rawya Zreik (1, 2)
  • Pierre Latouche (1)
  • Charles Bouveyron (2)
  1. Laboratoire SAMM, EA 4543, Université Paris 1 Panthéon-Sorbonne, Paris, France
  2. Laboratoire MAP5, UMR CNRS 8145, Université Paris Descartes & Sorbonne Paris Cité, Paris, France
