Abstract
In recent years, many clustering methods have been proposed to extract information from networks. The principle is to look for groups of vertices with homogeneous connection profiles. Most of these techniques are suitable only for static networks, that is to say, they do not take the temporal dimension into account. This work is motivated by the need to analyze evolving networks where a decomposition of the networks into subgraphs is given. Therefore, in this paper, we consider the random subgraph model (RSM), which was recently proposed to model networks through latent clusters built within known partitions. Using a state space model to characterize the cluster proportions, RSM is then extended to deal with dynamic networks. We call the resulting model the dynamic random subgraph model (dRSM). A variational expectation maximization (VEM) algorithm is proposed to perform inference. We show that the variational approximations lead to an update step involving a new state space model, from which the parameters along with the hidden states can be estimated using the standard Kalman filter and Rauch–Tung–Striebel smoother. Simulated data sets are used to assess the proposed methodology. Finally, dRSM along with the corresponding VEM algorithm are applied to an original maritime network built from printed Lloyd’s voyage records.
References
Ahmed A, Xing EP (2007) On tight approximate inference of logistic-normal admixture model. In: Proceedings of the international conference on artificial intelligence and statistics, pp 1–8
Airoldi E, Blei D, Fienberg S, Xing E (2008) Mixed membership stochastic blockmodels. J Mach Learn Res 9:1981–2014
Akaike H (1974) A new look at the statistical model identification. IEEE Trans Autom Control 19:716–723
Albert R, Barabási A (2002) Statistical mechanics of complex networks. Rev Mod Phys 74:47–97
Ambroise C, Grasseau G, Hoebeke M, Latouche P, Miele V, Picard F (2010) The mixer R package (version 1.8). http://cran.r-project.org/web/packages/mixer/
Barabási A, Oltvai Z (2004) Network biology: understanding the cell’s functional organization. Nat Rev Genet 5:101–113
Bickel P, Chen A (2009) A nonparametric view of network models and Newman–Girvan and other modularities. Proc Natl Acad Sci 106(50):21068–21073
Bishop C, Svensén M (2003) Bayesian hierarchical mixtures of experts. In: Kjaerulff U, Meek C (eds) Proceedings of the 19th conference on uncertainty in artificial intelligence, pp 57–64
Blei D, Lafferty J (2007) A correlated topic model of science. Ann Appl Stat 1(1):17–35
Bouveyron C, Jernite Y, Latouche P, Nouedoui L (2013) The rambo R package (version 1.1). http://cran.r-project.org/web/packages/Rambo/
Côme E, Latouche P (2015) Model selection and clustering in stochastic block models with the exact integrated complete data likelihood. Stat Model. doi:10.1177/1471082X15577017
Daudin J-J, Picard F, Robin S (2008) A mixture model for random graphs. Stat Comput 18(2):173–183
Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Ser B (Methodol) 39:1–38
Dubois C, Butts C, Smyth P (2013) Stochastic blockmodelling of relational event dynamics. In: International conference on artificial intelligence and statistics, JMLR Proceedings, vol 31, pp 238–246
Ducruet C (2013) Network diversity and maritime flows. J Transp Geogr 30:77–88
Fienberg S, Wasserman S (1981) Categorical data analysis of single sociometric relations. Sociol Methodol 12:156–192
Foulds JR, DuBois C, Asuncion AU, Butts CT, Smyth P (2011) A dynamic relational infinite feature model for longitudinal social networks. In: International conference on artificial intelligence and statistics, pp 287–295
Girvan M, Newman M (2002) Community structure in social and biological networks. Proc Natl Acad Sci 99(12):7821
Handcock M, Raftery A, Tantrum J (2007) Model-based clustering for social networks. J R Stat Soc Ser A (Stat Soc) 170(2):301–354
Harvey A (1989) Forecasting, structural time series models and the Kalman filter. Cambridge University Press, Cambridge
Hathaway RJ (1986) Another interpretation of the EM algorithm for mixture distributions. Stat Probab Lett 4(2):53–56
Heaukulani C, Ghahramani Z (2013) Dynamic probabilistic models for latent feature propagation in social networks. In: Proceedings of the 30th international conference on machine learning (ICML-13), pp 275–283
Ho Q, Song L, Xing EP (2011) Evolving cluster mixed-membership blockmodel for time-evolving networks. In: International conference on artificial intelligence and statistics, pp 342–350
Hofman J, Wiggins C (2008) Bayesian approach to network modularity. Phys Rev Lett 100(25):258701
Jernite Y, Latouche P, Bouveyron C, Rivera P, Jegou L, Lamassé S (2014) The random subgraph model for the analysis of an ecclesiastical network in Merovingian Gaul. Ann Appl Stat 8(1):55–74
Jordan M, Ghahramani Z, Jaakkola T, Saul LK (1999) An introduction to variational methods for graphical models. Mach Learn 37(2):183–233
Kemp C, Tenenbaum J, Griffiths T, Yamada T, Ueda N (2006) Learning systems of concepts with an infinite relational model. In: Proceedings of the national conference on artificial intelligence, vol 21, pp 381–391
Kim M, Leskovec J (2013) Nonparametric multi-group membership model for dynamic networks. In: Weiss Y, Schölkopf B, Platt J (eds) Advances in neural information processing systems, vol 25. MIT Press, Cambridge, pp 1385–1393
Krishnan T, McLachlan G (1997) The EM algorithm and extensions. Wiley, New York
Lafferty JD, Blei DM (2006) Correlated topic models. In: Weiss Y, Schölkopf B, Platt J (eds) Advances in neural information processing systems, vol 18. MIT Press, Cambridge, pp 147–154
Latouche P, Birmelé E, Ambroise C (2011) Overlapping stochastic block models with application to the French political blogosphere. Ann Appl Stat 5(1):309–336
Latouche P, Birmelé E, Ambroise C (2012) Variational Bayesian inference and complexity control for stochastic block models. Stat Model 12(1):93–115
Latouche P, Birmelé E, Ambroise C (2014) Model selection in overlapping stochastic block models. Electron J Stat 8(1):762–794
Leroux B (1992) Consistent estimation of a mixing distribution. Ann Stat 20:1350–1360
Mariadassou M, Robin S, Vacher C (2010) Uncovering latent structure in valued graphs: a variational approach. Ann Appl Stat 4(2):715–742
Matias C, Robin S (2014) Modeling heterogeneity in random graphs through latent space models: a selective review. ESAIM Proc Surv 47:55–74
McDaid A, Murphy T, Friel N, Hurley N (2013) Improved Bayesian inference for the stochastic block model with application to large networks. Comput Stat Data Anal 60:12–31
Minka T (1998) From hidden Markov models to linear dynamical systems. Technical report, MIT
Moreno J (1934) Who shall survive?: A new approach to the problem of human interrelations. Nervous and Mental Disease Publishing Co
Nowicki K, Snijders T (2001) Estimation and prediction for stochastic blockstructures. J Am Stat Assoc 96(455):1077–1087
Palla G, Derenyi I, Farkas I, Vicsek T (2005) Uncovering the overlapping community structure of complex networks in nature and society. Nature 435:814–818
Rand W (1971) Objective criteria for the evaluation of clustering methods. J Am Stat Assoc 66:846–850
Rauch H, Tung F, Striebel T (1965) Maximum likelihood estimates of linear dynamic systems. AIAA J 3(8):1445–1450
Rossi F, Villa-Vialaneix N, Hautefeuille F (2014) Exploration of a large database of French notarial acts with social network methods. Digit Mediev 9:1–20
Sarkar P, Moore AW (2005) Dynamic social network analysis using latent space models. ACM SIGKDD Explor Newsl 7(2):31–40
Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6:461–464
Svensén M, Bishop C (2004) Robust Bayesian mixture modelling. Neurocomputing 64:235–252
Wang Y, Wong G (1987) Stochastic blockmodels for directed graphs. J Am Stat Assoc 82:8–19
White H, Boorman S, Breiger R (1976) Social structure from multiple networks. I. Blockmodels of roles and positions. Am J Sociol 81:730–780
Xing E, Fu W, Song L (2010) A state-space mixed membership blockmodel for dynamic network tomography. Ann Appl Stat 4(2):535–566
Xu KS (2015) Stochastic block transition models for dynamic networks. In: International conference on artificial intelligence and statistics, pp 1079–1087
Xu KS, Hero III AO (2013) Dynamic stochastic blockmodels: statistical models for time-evolving networks. In: Greenberg AM, Kennedy WG, Bos ND (eds) Social computing, behavioral-cultural modeling and prediction. Springer, Berlin, Heidelberg, pp 201–210
Yang T, Chi Y, Zhu S, Gong Y, Jin R (2011) Detecting communities and their evolutions in dynamic social networks: a Bayesian approach. Mach Learn 82(2):157–189
Zanghi H, Volant S, Ambroise C (2010) Clustering based on random graph model embedding vertex features. Pattern Recognit Lett 31(9):830–836
Acknowledgments
The authors would like to warmly thank César Ducruet, from the Géographie-Cités laboratory, Paris, France, for providing the maritime network and for his painstaking analysis of the results. The data were collected in the context of the ERC Grant No. 313847 “World Seastems” (http://www.world-seastems.cnrs.fr). The authors would also like to thank Catherine Matias and Stéphane Robin for their useful remarks and comments on this work.
Appendices
Appendix 1: Construction of a tractable lower bound
We rely on a bound introduced in Jordan et al. (1999). Such a general bound can easily be derived by noticing that \(C(\cdot )\) is a concave function of \(\sum _{l=1}^{K}\exp (\gamma _{s_{i}l}^{(t)})\); a first-order Taylor expansion of the normalizing constant at any \(\xi _{s}^{(t)}\in {\mathbb {R}}^{*+}\) therefore leads to the inequality:
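Concretely, the bound in question is the standard variational inequality \(\log \sum _{l=1}^{K}\exp (\gamma _{l})\le \xi ^{-1}\sum _{l=1}^{K}\exp (\gamma _{l})-1+\log \xi \) for any \(\xi >0\), with equality at \(\xi =\sum _{l=1}^{K}\exp (\gamma _{l})\). The following Python snippet is a quick numerical sanity check of this inequality (an illustrative sketch, not part of the original paper):

```python
import math

def logsumexp_bound(gamma, xi):
    # First-order Taylor bound of the log at the point xi:
    # log(sum_l exp(gamma_l)) <= xi^{-1} * sum_l exp(gamma_l) - 1 + log(xi)
    s = sum(math.exp(g) for g in gamma)
    return s / xi - 1.0 + math.log(xi)

gamma = [0.3, -1.2, 0.8, 2.1]
exact = math.log(sum(math.exp(g) for g in gamma))

# The bound holds for any xi > 0 ...
bounds = [logsumexp_bound(gamma, xi) for xi in (0.5, 1.0, 5.0, 20.0)]

# ... and is tight at xi = sum_l exp(gamma_l), the Taylor expansion point.
tight = logsumexp_bound(gamma, sum(math.exp(g) for g in gamma))
```

Optimizing each \(\xi _{s}^{(t)}\) therefore amounts to setting it to the current value of \(\sum _{l=1}^{K}\exp (\hat{\gamma }_{sl}^{(t)}+\hat{\sigma }_{sl}^{2^{(t)}}/2)\), where the bound is exact.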
The bounds (8) on the \(C(\gamma _{s}^{(t)})\) terms induce a lower bound on the quantity \(\log p(Z|\gamma )\):
where \(\xi \) denotes the set of all variational parameters \((\xi _s^{(t)})_{st}\) and the function \(h(\cdot ,\cdot ,\cdot )\) is such that:
Replacing \(\log p(Z|\gamma )\) by \(\log h(Z,\gamma ,\xi )\) in \( {\mathcal {L}} (q, \theta ) \), leads to a new lower bound \(\tilde{{\mathcal {L}}}(q,\theta ,\xi )\) for \(\log p(X|\theta )\) which satisfies:
where
Appendix 2: E-step of the VEM algorithm
1.1 Distribution q(Z)
The VEM update step for each of the distributions \(q(Z_{i})\) in q(Z) is given by:
where all terms that do not depend on \(Z_{i}\) have been put into the constant terms \(\mathrm {const}\). Moreover since \(\gamma _{sk}^{(t)}\sim {\mathcal {N}}(\hat{\gamma }_{sk}^{(t)},\hat{\sigma }_{sk}^{2^{(t)}})\) we have used:
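The identity used here is the mean of a log-normal variable: if \(\gamma \sim {\mathcal {N}}(\hat{\gamma },\hat{\sigma }^{2})\) then \(E[\exp (\gamma )]=\exp (\hat{\gamma }+\hat{\sigma }^{2}/2)\). A Monte Carlo check of this identity (an illustrative sketch, not part of the original paper):

```python
import math
import random

# If gamma ~ N(mu, sigma^2), then exp(gamma) is log-normal and
# E[exp(gamma)] = exp(mu + sigma^2 / 2).
mu, sigma = 0.4, 0.7
closed_form = math.exp(mu + sigma ** 2 / 2)

# Monte Carlo estimate of E[exp(gamma)] from draws of gamma.
random.seed(1)
n = 200_000
mc = sum(math.exp(random.gauss(mu, sigma)) for _ in range(n)) / n
```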
We then recognize the functional form of a multinomial distribution:
1.2 Distribution \(q(\nu )\)
The VEM update step for the distribution \(q(\nu )\) is given by:
where all terms that do not depend on \(\nu \) have been put into the constant terms \(\mathrm {const}\). We recognize the functional form of the posterior distribution of a linear dynamic system:
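Since \(q(\nu )\) takes the form of the posterior of a linear dynamic system, its moments can be obtained by a forward Kalman filter followed by a backward Rauch–Tung–Striebel pass. As an illustration only (a minimal scalar sketch, not the paper's multivariate model or code), assume a random-walk state \(\nu ^{(t)}\) observed through \(x^{(t)}=b\nu ^{(t)}+\text {noise}\):

```python
import random

def kalman_rts(x, b, q, r, mu0, p0):
    """Kalman filter + Rauch-Tung-Striebel smoother for the scalar model
        nu_t = nu_{t-1} + w_t,   w_t ~ N(0, q)   (random-walk state)
        x_t  = b * nu_t + v_t,   v_t ~ N(0, r)   (observation)
    with prior nu_0 ~ N(mu0, p0). Returns smoothed and filtered moments."""
    T = len(x)
    mp, pp = [0.0] * T, [0.0] * T   # one-step-ahead predicted mean/variance
    mf, pf = [0.0] * T, [0.0] * T   # filtered mean/variance
    m, p = mu0, p0
    for t in range(T):
        mp[t], pp[t] = m, p + q                   # prediction step
        k = pp[t] * b / (b * b * pp[t] + r)       # Kalman gain
        m = mp[t] + k * (x[t] - b * mp[t])        # update mean with x_t
        p = (1.0 - k * b) * pp[t]                 # update variance
        mf[t], pf[t] = m, p
    ms, ps = mf[:], pf[:]                         # backward RTS pass
    for t in range(T - 2, -1, -1):
        g = pf[t] / pp[t + 1]                     # smoother gain
        ms[t] = mf[t] + g * (ms[t + 1] - mp[t + 1])
        ps[t] = pf[t] + g * g * (ps[t + 1] - pp[t + 1])
    return ms, ps, mf, pf

# Simulate a short series and run both passes.
random.seed(2)
b, q, r = 1.0, 0.05, 0.2
nu, xs = 0.0, []
for _ in range(200):
    nu += random.gauss(0.0, q ** 0.5)
    xs.append(b * nu + random.gauss(0.0, r ** 0.5))
ms, ps, mf, pf = kalman_rts(xs, b, q, r, 0.0, 1.0)
```

The backward pass can only reduce the posterior variance at each time step, since it incorporates future observations the filter has not yet seen.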
Appendix 3: Derivation of the lower bound
In the following, we denote by \(x^{(t)}=\dfrac{\sum _{s=1}^{S}\hat{\gamma }_{s}^{(t)}}{S}\) an observed variable. The lower bound is given by:
Note that (see Proposition 3.3),
As pointed out in this proposition, this corresponds to the form of the posterior distribution associated with a state space model with parameter \(\theta ^{'}\) and with observed outputs \(x=(x^{(t)})_{t}\). If we denote by \(p(x|\theta ^{'})\) the likelihood associated with this model, and by \(p(x,\nu |\theta ^{'})\) the joint likelihood, we have
Therefore
This leads to,
and \(\tilde{{\mathcal {L}}}(q,\theta ,\xi )\) can be written as follows:
We detail below each of the terms of the bound \(\tilde{{\mathcal {L}}}(q,\theta ,\xi )\).
1. \(E_{Z}(\log p(X|Z,\Pi ))\):
$$\begin{aligned} E_{Z}(\log p(X|Z,\Pi ))= & {} \sum _{t=1}^{T}\sum _{k,l}^{K}\sum _{c=0}^{C}\sum _{i\ne j}^{N}E_{Z}\big (\delta (X_{ij}^{(t)}=c)Z_{ik}^{(t)}Z_{jl}^{(t)}\big )\log (\Pi _{kl}^{c})\\= & {} \sum _{t=1}^{T}\sum _{k,l}^{K}\sum _{c=0}^{C}\sum _{i\ne j}^{N}\delta (X_{ij}^{(t)}=c)\tau _{ik}^{(t)}\tau _{jl}^{(t)}\log (\Pi _{kl}^{c}) \end{aligned}$$
2. \(E_{Z,\gamma }(\log h(Z,\gamma ,\xi ))\):
$$\begin{aligned}&E_{Z,\gamma }\Big (\log h(Z,\gamma ,\xi )\Big ) \\&\quad = E_{Z,\gamma }\Big [\sum _{t=1}^{T}\sum _{k=1}^{K}\sum _{i=1}^{N}\sum _{s=1}^{S} y_{is} Z_{ik}^{(t)}\Big (\gamma _{sk}^{(t)}-\big (\xi _{s}^{-1(t)}\sum _{l=1}^{K}\exp (\gamma _{sl}^{(t)})- 1+\log (\xi _{s}^{(t)})\big )\Big )\Big ]\\&\quad = \sum _{t=1}^{T}\sum _{k=1}^{K}\sum _{i=1}^{N}\sum _{s=1}^{S} y_{is} \Big (\tau _{ik}^{(t)}\hat{\gamma }_{sk}^{(t)}-\tau _{ik}^{(t)}\xi _{s}^{-1(t)}\sum _{l=1}^{K}\exp \Big (\hat{\gamma }_{sl}^{(t)}+\dfrac{\hat{\sigma }_{sl}^{2^{(t)}}}{2}\Big )\\&\qquad + \tau _{ik}^{(t)}-\tau _{ik}^{(t)}\log (\xi _{s}^{(t)})\Big )\\&\quad = \sum _{t=1}^{T}\sum _{s=1}^{S}\Big (\sum _{k=1}^{K}r_{sk}^{(t)}\hat{\gamma }_{sk}^{(t)}-N_{s}\xi _{s}^{-1(t)}\sum _{l=1}^{K}\exp \Big (\hat{\gamma }_{sl}^{(t)}+\dfrac{\hat{\sigma }_{sl}^{2^{(t)}}}{2}\Big )+N_{s}-N_{s}\log (\xi _{s}^{(t)})\Big ) \end{aligned}$$where \(r_{sk}^{(t)}=\sum _{i=1}^{N}\tau _{ik}^{(t)}y_{is}\) and \(N_{s}=\sum _{i=1}^{N}y_{is}\).
3. \(E_{\gamma ,\nu }(\log p(\gamma |\nu ,\Sigma ,B))\):
$$\begin{aligned} E_{\gamma ,\nu }(\log p(\gamma |\nu ,\Sigma ,B))= & {} E_{\gamma ,\nu }\Big (\log \prod _{t=1}^{T}\prod _{s=1}^{S}{\mathcal {N}}(\gamma _{s}^{(t)};B\nu _{s}^{(t)},\Sigma )\Big )\\= & {} \sum _{t=1}^{T}\sum _{s=1}^{S}\Big (\log {\mathcal {N}}(\hat{\gamma }_{s}^{(t)};B\hat{\nu }_{s}^{(t)},\Sigma )-\dfrac{1}{2}tr(\Sigma ^{-1}B^{T}\hat{V}^{(t)}B)\\&-\,\dfrac{1}{2}tr\Big (\Sigma ^{-1}\hat{\sigma }_{s}^{(t)^{2}}\Big )\Big ) \end{aligned}$$
4. \(E_{\gamma }(\log q(\gamma ))\):
$$\begin{aligned} E_{\gamma }(\log q(\gamma ))= & {} E_{\gamma }\Big (\log \prod _{t=1}^{T}\prod _{s=1}^{S}\prod _{k=1}^{K}{\mathcal {N}}(\gamma _{sk}^{(t)};\hat{\gamma }_{sk}^{(t)},\hat{\sigma }_{sk}^{2^{(t)}})\Big )\\= & {} \sum _{t=1}^{T}\sum _{s=1}^{S}\sum _{k=1}^{K} -\log \Big ( (2\pi )^{\frac{1}{2}}\hat{\sigma }_{sk}^{(t)}\Big )-\dfrac{TKS}{2}. \end{aligned}$$
5. \(E_{\nu }\Big (\log p\Big (\dfrac{\sum _{s=1}^{S}\hat{\gamma }_{s}^{(t)}}{S}|\nu ^{(t)},\dfrac{\Sigma }{S},B\Big )\Big )\):
$$\begin{aligned}&E_{\nu }\Big (\log p\Big (\dfrac{\sum _{s=1}^{S}\hat{\gamma }_{s}^{(t)}}{S}|\nu ^{(t)},\dfrac{\Sigma }{S},B\Big )\Big )\\&\quad = \sum _{t=1}^{T}\Big (\log {\mathcal {N}}(x^{(t)};B\hat{\nu }^{(t)},\Sigma /S)-\dfrac{1}{2}tr(\Sigma ^{-1}SB^{T}\hat{V}^{(t)}B)\Big ). \end{aligned}$$
6.
\(E_{Z}(\log (\prod _{i=1}^{T}q(Z_{i})))\):
Zreik, R., Latouche, P. & Bouveyron, C. The dynamic random subgraph model for the clustering of evolving networks. Comput Stat 32, 501–533 (2017). https://doi.org/10.1007/s00180-016-0655-5