The dynamic random subgraph model for the clustering of evolving networks

  • Original Paper
  • Published in Computational Statistics

Abstract

In recent years, many clustering methods have been proposed to extract information from networks. The principle is to look for groups of vertices with homogeneous connection profiles. Most of these techniques are suitable for static networks, that is to say, they do not take the temporal dimension into account. This work is motivated by the need to analyze evolving networks for which a decomposition into subgraphs is given. Therefore, in this paper, we consider the random subgraph model (RSM), which was proposed recently to model networks through latent clusters built within known partitions. RSM is then extended to deal with dynamic networks by using a state space model to characterize the cluster proportions. We call the resulting model the dynamic random subgraph model (dRSM). A variational expectation maximization (VEM) algorithm is proposed to perform inference. We show that the variational approximations lead to an update step which involves a new state space model, from which the parameters along with the hidden states can be estimated using the standard Kalman filter and Rauch–Tung–Striebel smoother. Simulated data sets are considered to assess the proposed methodology. Finally, dRSM along with the corresponding VEM algorithm is applied to an original maritime network built from printed Lloyd’s voyage records.



References

  • Ahmed A, Xing EP (2007) On tight approximate inference of logistic-normal admixture model. In: Proceedings of the international conference on artificial intelligence and statistics, pp 1–8

  • Airoldi E, Blei D, Fienberg S, Xing E (2008) Mixed membership stochastic blockmodels. J Mach Learn Res 9:1981–2014


  • Akaike H (1974) A new look at the statistical model identification. IEEE Trans Autom Control 19:716–723


  • Albert R, Barabási A (2002) Statistical mechanics of complex networks. Rev Mod Phys 74:47–97


  • Ambroise C, Grasseau G, Hoebeke M, Latouche P, Miele V, Picard F (2010) The mixer R package (version 1.8). http://cran.r-project.org/web/packages/mixer/

  • Barabási A, Oltvai Z (2004) Network biology: understanding the cell’s functional organization. Nat Rev Genet 5:101–113


  • Bickel P, Chen A (2009) A nonparametric view of network models and Newman–Girvan and other modularities. Proc Natl Acad Sci 106(50):21068–21073


  • Bishop C, Svensén M (2003) Bayesian hierarchical mixtures of experts. In: Kjaerulff U, Meek C (eds) Proceedings of the 19th conference on uncertainty in artificial intelligence, pp 57–64

  • Blei D, Lafferty J (2007) A correlated topic model of science. Ann Appl Stat 1(1):17–35


  • Bouveyron C, Jernite Y, Latouche P, Nouedoui L (2013) The rambo R package (version 1.1). http://cran.r-project.org/web/packages/Rambo/

  • Côme E, Latouche P (2015) Model selection and clustering in stochastic block models with the exact integrated complete data likelihood. Stat Model. doi:10.1177/1471082X15577017


  • Daudin J-J, Picard F, Robin S (2008) A mixture model for random graphs. Stat Comput 18(2):173–183


  • Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Ser B (Methodol) 39:1–38

  • Dubois C, Butts C, Smyth P (2013) Stochastic blockmodelling of relational event dynamics. In: International conference on artificial intelligence and statistics, J Mach Learn Res Proc, vol 31, pp 238–246

  • Ducruet C (2013) Network diversity and maritime flows. J Transp Geogr 30:77–88


  • Fienberg S, Wasserman S (1981) Categorical data analysis of single sociometric relations. Sociol Methodol 12:156–192


  • Foulds JR, DuBois C, Asuncion AU, Butts CT, Smyth P (2011) A dynamic relational infinite feature model for longitudinal social networks. In: International conference on artificial intelligence and statistics, pp 287–295

  • Girvan M, Newman M (2002) Community structure in social and biological networks. Proc Natl Acad Sci 99(12):7821


  • Handcock M, Raftery A, Tantrum J (2007) Model-based clustering for social networks. J R Stat Soc Ser A (Stat Soc) 170(2):301–354


  • Harvey A (1989) Forecasting, structural time series models and the Kalman filter. Cambridge University Press, Cambridge


  • Hathaway RJ (1986) Another interpretation of the EM algorithm for mixture distributions. Stat Probab Lett 4(2):53–56


  • Heaukulani C, Ghahramani Z (2013) Dynamic probabilistic models for latent feature propagation in social networks. In: Proceedings of the 30th international conference on machine learning (ICML-13), pp 275–283

  • Ho Q, Song L, Xing EP (2011) Evolving cluster mixed-membership blockmodel for time-evolving networks. In: International conference on artificial intelligence and statistics, pp 342–350

  • Hofman J, Wiggins C (2008) Bayesian approach to network modularity. Phys Rev Lett 100(25):258701


  • Jernite Y, Latouche P, Bouveyron C, Rivera P, Jegou L, Lamassé S (2014) The random subgraph model for the analysis of an ecclesiastical network in Merovingian Gaul. Ann Appl Stat 8(1):55–74


  • Jordan M, Ghahramani Z, Jaakkola T, Saul LK (1999) An introduction to variational methods for graphical models. Mach Learn 37(2):183–233


  • Kemp C, Tenenbaum J, Griffiths T, Yamada T, Ueda N (2006) Learning systems of concepts with an infinite relational model. In: Proceedings of the national conference on artificial intelligence, vol 21, pp 381–391

  • Kim M, Leskovec J (2013) Nonparametric multi-group membership model for dynamic networks. In: Weiss Y, Schölkopf B, Platt J (eds) Advances in neural information processing systems, vol 25. MIT Press, Cambridge, pp 1385–1393


  • Krishnan T, McLachlan G (1997) The EM algorithm and extensions. Wiley, New York


  • Lafferty JD, Blei DM (2006) Correlated topic models. In: Weiss Y, Schölkopf B, Platt J (eds) Advances in neural information processing systems, vol 18. MIT Press, Cambridge, pp 147–154


  • Latouche P, Birmelé E, Ambroise C (2011) Overlapping stochastic block models with application to the French political blogosphere. Ann Appl Stat 5(1):309–336


  • Latouche P, Birmelé E, Ambroise C (2012) Variational Bayesian inference and complexity control for stochastic block models. Stat Model 12(1):93–115


  • Latouche P, Birmelé E, Ambroise C (2014) Model selection in overlapping stochastic block models. Electron J Stat 8(1):762–794


  • Leroux B (1992) Consistent estimation of a mixing distribution. Ann Stat 20:1350–1360


  • Mariadassou M, Robin S, Vacher C (2010) Uncovering latent structure in valued graphs: a variational approach. Ann Appl Stat 4(2):715–742


  • Matias C, Robin S (2014) Modeling heterogeneity in random graphs through latent space models: a selective review. ESAIM Proc Surv 47:55–74


  • Mc Daid A, Murphy T, Friel N, Hurley N (2013) Improved Bayesian inference for the stochastic block model with application to large networks. Comput Stat Data Anal 60:12–31


  • Minka T (1998) From hidden Markov models to linear dynamical systems. Technical report, MIT

  • Moreno J (1934) Who shall survive?: A new approach to the problem of human interrelations. Nervous and Mental Disease Publishing Co

  • Nowicki K, Snijders T (2001) Estimation and prediction for stochastic blockstructures. J Am Stat Assoc 96(455):1077–1087


  • Palla G, Derenyi I, Farkas I, Vicsek T (2005) Uncovering the overlapping community structure of complex networks in nature and society. Nature 435:814–818


  • Rand W (1971) Objective criteria for the evaluation of clustering methods. J Am Stat Assoc 66:846–850

  • Rauch H, Tung F, Striebel T (1965) Maximum likelihood estimates of linear dynamic systems. AIAA J 3(8):1445–1450


  • Rossi F, Villa-Vialaneix N, Hautefeuille F (2014) Exploration of a large database of French notarial acts with social network methods. Digit Mediev 9:1–20


  • Sarkar P, Moore AW (2005) Dynamic social network analysis using latent space models. ACM SIGKDD Explor Newsl 7(2):31–40


  • Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6:461–464


  • Svensén M, Bishop C (2004) Robust Bayesian mixture modelling. Neurocomputing 64:235–252


  • Wang Y, Wong G (1987) Stochastic blockmodels for directed graphs. J Am Stat Assoc 82:8–19


  • White H, Boorman S, Breiger R (1976) Social structure from multiple networks. I. Blockmodels of roles and positions. Am J Sociol 81:730–780

  • Xing E, Fu W, Song L (2010) A state-space mixed membership blockmodel for dynamic network tomography. Ann Appl Stat 4(2):535–566


  • Xu KS (2015) Stochastic block transition models for dynamic networks. In: International conference on artificial intelligence and statistics, pp 1079–1087

  • Xu KS, Hero III AO (2013) Dynamic stochastic blockmodels: statistical models for time-evolving networks. In: Greenberg AM, Kennedy WG, Bos ND (eds) Social computing, behavioral-cultural modeling and prediction. Springer, Berlin, Heidelberg, pp 201–210

  • Yang T, Chi Y, Zhu S, Gong Y, Jin R (2011) Detecting communities and their evolutions in dynamic social networks: a Bayesian approach. Mach Learn 82(2):157–189


  • Zanghi H, Volant S, Ambroise C (2010) Clustering based on random graph model embedding vertex features. Pattern Recognit Lett 31(9):830–836



Acknowledgments

The authors would like to warmly thank César Ducruet, from the Géographie-Cités laboratory, Paris, France, for providing the maritime network and for his painstaking analysis of the results. The data were collected in the context of the ERC Grant No. 313847 “World Seastems” (http://www.world-seastems.cnrs.fr). The authors would also like to thank Catherine Matias and Stéphane Robin for their useful remarks and comments on this work.

Author information

Corresponding author

Correspondence to Pierre Latouche.

Appendices

Appendix 1: Construction of a tractable lower bound

We rely on a bound introduced in Jordan et al. (1999). This general bound can easily be derived by noticing that \(C(\cdot )\) is a concave function of \(\sum _{l=1}^{K}\exp (\gamma _{s_{i}l}^{(t)})\), so that a first-order Taylor expansion of the normalizing constant at any \(\xi _{s}^{(t)}\in {\mathbb {R}}^{*+}\) leads to the inequality:

$$\begin{aligned} \log \left( \sum _{l=1}^{K}\exp \left( \gamma _{sl}^{\left( t\right) }\right) \right) \le \xi _{s}^{-1\left( t\right) }\left( \sum _{l=1}^{K}\exp \left( \gamma _{sl}^{\left( t\right) }\right) \right) -1+\log \left( \xi _{s}^{\left( t\right) }\right) . \end{aligned}$$
(8)
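As a quick numerical sanity check (ours, not the paper's), the bound (8) holds for any positive \(\xi \) and is tight at \(\xi _{s}^{(t)}=\sum _{l=1}^{K}\exp (\gamma _{sl}^{(t)})\):

```python
import numpy as np

rng = np.random.default_rng(0)
gamma = rng.normal(size=6)           # arbitrary values gamma_{sl}^{(t)}
lse = np.log(np.sum(np.exp(gamma)))  # left-hand side: log-sum-exp

def upper_bound(xi):
    # right-hand side of (8), valid for any xi > 0
    return np.sum(np.exp(gamma)) / xi - 1.0 + np.log(xi)

# the bound holds for any positive xi ...
for xi in [0.1, 1.0, 10.0]:
    assert lse <= upper_bound(xi) + 1e-12

# ... and is tight at xi = sum_l exp(gamma_l)
xi_star = np.sum(np.exp(gamma))
assert np.isclose(lse, upper_bound(xi_star))
```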

The bounds (8) on the \(C(\gamma _{s}^{(t)})\) terms induce a lower bound on the quantity \(\log p(Z|\gamma )\):

$$\begin{aligned} \log p(Z|\gamma )= & {} \sum _{t=1}^{T}\sum _{k=1}^{K}\sum _{i=1}^{N}Z_{ik}^{(t)}\log \Big (f_k\Big (\gamma _{s_{i}}^{(t)}\Big )\Big )\\= & {} \sum _{t=1}^{T}\sum _{k=1}^{K}\sum _{i=1}^{N}\sum _{s=1}^{S} y_{is}Z_{ik}^{(t)}\Big (\gamma _{sk}^{(t)}-\log \Big (\sum _{l=1}^{K}\exp \Big (\gamma _{sl}^{(t)}\Big )\Big )\Big )\\\ge & {} \log h(Z,\gamma ,\xi ), \end{aligned}$$

where \(\xi \) denotes the set of all variational parameters \((\xi _s^{(t)})_{st}\) and the function \(h(\cdot ,\cdot ,\cdot )\) is such that:

$$\begin{aligned}&\log h(Z,\gamma ,\xi )\\&\quad = \sum _{t=1}^{T}\sum _{k=1}^{K}\sum _{i=1}^{N} \sum _{s=1}^{S} y_{is}Z_{ik}^{(t)}\Big (\gamma _{sk}^{(t)}-\big (\xi _{s}^{-1(t)}\sum _{l=1}^{K}\exp \big (\gamma _{sl}^{(t)}\big )-1+\log \big (\xi _{s}^{(t)}\big )\big )\Big ) . \end{aligned}$$

Replacing \(\log p(Z|\gamma )\) by \(\log h(Z,\gamma ,\xi )\) in \( {\mathcal {L}} (q, \theta ) \) leads to a new lower bound \(\tilde{{\mathcal {L}}}(q,\theta ,\xi )\) for \(\log p(X|\theta )\) which satisfies:

$$\begin{aligned} \log p(X|\theta )\geqslant {\mathcal {L}}(q,\theta )\geqslant \tilde{{\mathcal {L}}}(q,\theta ,\xi ), \end{aligned}$$

where

$$\begin{aligned}&\tilde{{\mathcal {L}}}(q,\theta ,\xi )\\&\quad =\sum _{Z}\int _{\gamma }\int _{\nu }q(Z,\gamma ,\nu )\log \dfrac{p(X|Z,\Pi )h(Z,\gamma ,\xi )p(\gamma |B,\nu ,\Sigma )p(\nu |\mu _{0},A,\Phi ,V_{0})}{q(Z,\gamma ,\nu )} d\gamma \, d\nu . \end{aligned}$$

Appendix 2: E-step of the VEM algorithm

1.1 Distribution q(Z)

The VEM update step for each of the distributions \(q(Z_{i})\) in q(Z) is given by:

$$\begin{aligned} \log q(Z_{i})= & {} E_{\gamma ,\nu ,Z^{\backslash i}}[\log p(X|Z,\Pi )+\log h(Z,\gamma ,\xi )]+\mathrm {const}\\= & {} \sum _{t=1}^{T}\sum _{k=1}^{K}Z_{ik}^{(t)}\left( \sum _{l=1}^{K}\sum _{c=0}^{C}\sum _{i\ne j}^{N}\delta (X_{ij}^{(t)}=c)\tau _{jl}^{(t)}\Big [\log (\Pi _{kl}^{c})+\log (\Pi _{lk}^{c})\Big ]\right) \\&+\,\sum _{t=1}^{T}\sum _{k=1}^{K}\sum _{s=1}^{S}Z_{ik}^{(t)}y_{is}E_{\gamma }\Big [\gamma _{sk}^{(t)}-\Big (\xi _{s}^{-1(t)}\sum _{l=1}^{K}\exp \big (\gamma _{sl}^{(t)}\big )-1+\log \big (\xi _{s}^{(t)}\big )\Big )\Big ]+\mathrm {const}\\= & {} \sum _{t=1}^{T}\sum _{k=1}^{K}Z_{ik}^{(t)}\left( \sum _{l=1}^{K}\sum _{c=0}^{C}\sum _{i\ne j}^{N}\delta (X_{ij}^{(t)}=c)\tau _{jl}^{(t)}\Big [\log (\Pi _{kl}^{c})+\log (\Pi _{lk}^{c})\Big ]\right) \\&+\,\sum _{t=1}^{T}\sum _{k=1}^{K}\sum _{s=1}^{S}Z_{ik}^{(t)}y_{is}\left( \hat{\gamma }_{sk}^{(t)}-\Big [\xi _{s}^{-1(t)}\sum _{l=1}^{K}E\big (\exp \big (\gamma _{sl}^{(t)}\big )\big )-1+\log \big (\xi _{s}^{(t)}\big )\Big ]\right) +\mathrm {const}\\= & {} \sum _{t=1}^{T}\sum _{k=1}^{K}Z_{ik}^{(t)}\Bigg (\sum _{l=1}^{K}\sum _{c=0}^{C}\sum _{i\ne j}^{N}\delta (X_{ij}^{(t)}=c)\tau _{jl}^{(t)}\Big [\log (\Pi _{kl}^{c})+\log (\Pi _{lk}^{c})\Big ]\\&+\, \sum _{s=1}^{S} y_{is} \Big ( \hat{\gamma }_{sk}^{(t)}-\Big (\xi _{s}^{-1(t)}\sum _{l=1}^{K}\exp \Big (\hat{\gamma }_{sl}^{(t)}+\dfrac{\hat{\sigma }_{sl}^{2^{(t)}}}{2}\Big )-1+\log \big (\xi _{s}^{(t)}\big )\Big ) \Big )\Bigg )\\&+\,\mathrm {const}, \end{aligned}$$

where all terms that do not depend on \(Z_{i}\) have been put into the constant term \(\mathrm {const}\). Moreover, since \(\gamma _{sk}^{(t)}\sim {\mathcal {N}}(\hat{\gamma }_{sk}^{(t)},\hat{\sigma }_{sk}^{2^{(t)}})\), we have used:

$$\begin{aligned} {\mathbb {E}}\left[ \exp \left( \gamma _{sk}^{\left( t\right) }\right) \right] =\exp \left( \hat{\gamma }_{sk}^{\left( t\right) }+\dfrac{\hat{\sigma }_{sk}^{2^{\left( t\right) }}}{2}\right) . \end{aligned}$$
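This is the standard lognormal-mean identity; a quick Monte Carlo check (illustrative values, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
g_hat, s2 = 0.3, 0.5                 # posterior mean and variance of gamma_{sk}^{(t)}
samples = rng.normal(g_hat, np.sqrt(s2), size=2_000_000)

mc = np.exp(samples).mean()          # Monte Carlo estimate of E[exp(gamma)]
closed = np.exp(g_hat + s2 / 2.0)    # closed-form lognormal mean

assert abs(mc - closed) / closed < 0.01
```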

We then recognize the functional form of a multinomial distribution:

$$\begin{aligned} q\left( Z_{i}^{\left( t\right) }\right) \sim {\mathcal {M}}\left( Z_{i}^{\left( t\right) };1,\tau _{i}^{\left( t\right) }\right) ,\,\,\forall i,t. \end{aligned}$$
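In practice the multinomial parameters \(\tau _{i}^{(t)}\) are obtained by exponentiating and normalizing the bracketed log-terms above. A minimal, numerically stable sketch (function and variable names are ours, not the paper's):

```python
import numpy as np

def responsibilities(log_weights):
    """Normalize unnormalized log-probabilities (one row per time step t,
    one column per cluster k) into multinomial parameters tau_i^{(t)}."""
    log_weights = np.asarray(log_weights, dtype=float)
    # subtract the row maximum before exponentiating to avoid overflow
    shifted = log_weights - log_weights.max(axis=-1, keepdims=True)
    w = np.exp(shifted)
    return w / w.sum(axis=-1, keepdims=True)

tau = responsibilities([[2.0, 1.0, 0.1]])
assert np.isclose(tau.sum(), 1.0)
assert tau.argmax() == 0  # largest log-weight gets the largest probability
```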

1.2 Distribution \(q(\nu )\)

The VEM update step for the distribution \(q(\nu )\) is given by:

$$\begin{aligned} \log q(\nu )= & {} E_{Z,\gamma }\Big (\log p(\gamma |\nu ,\Sigma ,B)+\log p(\nu |\mu _{0},V_{0},A,\Phi )\Big )+\mathrm {const}\\= & {} \sum _{t=1}^{T}\sum _{s=1}^{S}\Big (E_{\gamma }\big (\log {\mathcal {N}}(\gamma _{s}^{(t)};B\nu ^{(t)},\Sigma )\big )\Big )+\log p(\nu ^{(1)}|\mu _{0},V_{0})\\&+\, \sum _{t=2}^{T}\log p(\nu ^{(t)}|\nu ^{(t-1)},A,\Phi )+\mathrm {const}\\= & {} \sum _{t=1}^{T}\sum _{s=1}^{S}\Big (E_{\gamma }\big (-\dfrac{1}{2}(\gamma _{s}^{(t)})^{\intercal }\Sigma ^{-1}(\gamma _{s}^{(t)})+(\gamma _{s}^{(t)})^{\intercal }\Sigma ^{-1}B\nu ^{(t)}\\&-\,\dfrac{1}{2}(\nu ^{(t)})^{\intercal }B^{\intercal }\Sigma ^{-1}B\nu ^{(t)}\big )\Big )\\&+\, \log p(\nu ^{(1)}|\mu _{0},V_{0})+\sum _{t=2}^{T}\log p(\nu ^{(t)}|\nu ^{(t-1)},A,\Phi )+\mathrm {const}\\= & {} \sum _{t=1}^{T}\Big ( \sum _{s=1}^{S}\Big ((\hat{\gamma }_{s}^{(t)})^{\intercal }\Sigma ^{-1}B\nu ^{(t)}\Big ) -\dfrac{1}{2}(\nu ^{(t)})^{\intercal }B^{\intercal }(S\Sigma ^{-1})B\nu ^{(t)} \Big ) \\&+\, \log p(\nu ^{(1)}|\mu _{0},V_{0})+\sum _{t=2}^{T}\log p(\nu ^{(t)}|\nu ^{(t-1)},A,\Phi )+\mathrm {const}, \end{aligned}$$

where all terms that do not depend on \(\nu \) have been put into the constant terms \(\mathrm {const}\). We recognize the functional form of the posterior distribution of a linear dynamic system:

$$\begin{aligned} \log q(\nu )= & {} \sum _{t=1}^{T}\Big (\log {\mathcal {N}}\Big (\dfrac{\sum _{s=1}^{S}\hat{\gamma }_{s}^{(t)}}{S};B\nu ^{(t)},\dfrac{\Sigma }{S}\Big )\Big )\\&+\, \log p(\nu ^{(1)}|\mu _{0},V_{0})+\sum _{t=2}^{T}\log p(\nu ^{(t)}|\nu ^{(t-1)},A,\Phi )+\mathrm {const}. \end{aligned}$$
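This identification is what makes the update tractable: with pseudo-observations \(x^{(t)}=\sum _{s=1}^{S}\hat{\gamma }_{s}^{(t)}/S\), the moments of \(q(\nu )\) follow from a forward Kalman filter pass and a backward Rauch–Tung–Striebel pass. A minimal scalar sketch (names and toy parameters are ours, not the paper's):

```python
import numpy as np

def kalman_rts(x, A, B, Phi, Sigma, mu0, V0):
    """Forward Kalman filter and backward RTS smoother for the model
    nu_t = A nu_{t-1} + w_t,  x_t = B nu_t + v_t,
    with w_t ~ N(0, Phi) and v_t ~ N(0, Sigma); scalar case for brevity."""
    T = len(x)
    mu_f = np.zeros(T); V_f = np.zeros(T)   # filtered moments
    mu_p = np.zeros(T); V_p = np.zeros(T)   # one-step predictions
    for t in range(T):
        if t == 0:
            mu_p[t], V_p[t] = mu0, V0
        else:
            mu_p[t] = A * mu_f[t - 1]
            V_p[t] = A * V_f[t - 1] * A + Phi
        K = V_p[t] * B / (B * V_p[t] * B + Sigma)     # Kalman gain
        mu_f[t] = mu_p[t] + K * (x[t] - B * mu_p[t])
        V_f[t] = (1.0 - K * B) * V_p[t]
    mu_s = mu_f.copy(); V_s = V_f.copy()    # smoothed moments
    for t in range(T - 2, -1, -1):
        J = V_f[t] * A / V_p[t + 1]         # smoother gain
        mu_s[t] = mu_f[t] + J * (mu_s[t + 1] - A * mu_f[t])
        V_s[t] = V_f[t] + J * (V_s[t + 1] - V_p[t + 1]) * J
    return mu_s, V_s

# toy run: noisy observations of a slowly drifting state
rng = np.random.default_rng(2)
nu = np.cumsum(rng.normal(0.0, 0.1, 50))
x = nu + rng.normal(0.0, 0.5, 50)
mu_s, V_s = kalman_rts(x, A=1.0, B=1.0, Phi=0.01, Sigma=0.25, mu0=0.0, V0=1.0)
assert np.mean((mu_s - nu) ** 2) < np.mean((x - nu) ** 2)  # smoothing reduces error
```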

Appendix 3: Derivation of the lower bound

In the following, \(x^{(t)}=\dfrac{\sum _{s=1}^{S}\hat{\gamma }_{s}^{(t)}}{S}\) is treated as an observed variable. The lower bound is given by:

$$\begin{aligned}&\tilde{{\mathcal {L}}}\left( q,\theta ,\xi \right) \\&\quad = \sum _{Z}\int _{\gamma }\int _{\nu }q\left( Z,\gamma ,\nu \right) \log \dfrac{p\left( X|Z,\Pi \right) h\left( Z,\gamma ,\xi \right) p\left( \gamma |\nu ,\Sigma \right) p\left( \nu |\mu _{0},A,\Phi ,V_{0}\right) }{q\left( Z,\gamma ,\nu \right) } d\nu d\gamma \\&\quad = E_{Z,\gamma ,\nu }\left[ \log \dfrac{p\left( X|Z,\Pi \right) h\left( Z,\gamma ,\xi \right) p\left( \gamma |\nu ,\Sigma ,B\right) p\left( \nu |\mu _{0},A,\Phi ,V_{0}\right) }{q\left( \gamma \right) q\left( \nu \right) \prod _{i=1}^{N}q\left( Z_{i}\right) }\right] \\&\quad = E_{Z}\left( \log p\left( X|Z,\Pi \right) \right) +E_{Z,\gamma }\left( \log h\left( Z,\gamma ,\xi \right) \right) +E_{\gamma ,\nu }\left( \log p\left( \gamma |\nu ,\Sigma ,B\right) \right) \\&\qquad + E_{\nu }\left( \log p\left( \nu |\mu _{0},A,\Phi ,V_{0}\right) \right) -E_{\gamma }\left( \log q\left( \gamma \right) \right) -E_{\nu }\left( \log q\left( \nu \right) \right) \\&\qquad -E_{Z}\left( \log \left( \prod _{i=1}^{N}q\left( Z_{i}\right) \right) \right) . \end{aligned}$$

Note that (see Proposition 3.3),

$$\begin{aligned} q(\nu )\propto p(\nu ^{(1)}|\mu _{0},V_{0})\Big [\prod _{t=2}^{T}p(\nu ^{(t)}|\nu ^{(t-1)},A,\Phi )\Big ]\Big [\prod _{t=1}^{T}{\mathcal {N}}\Big (\dfrac{\sum _{s=1}^{S}\hat{\gamma }_{s}^{(t)}}{S};B\nu ^{(t)},\dfrac{\Sigma }{S}\Big )\Big ]. \end{aligned}$$

As pointed out in this proposition, this corresponds to the form of the posterior distribution associated with a state space model with parameters \(\theta ^{'}\) and observed outputs \(x=(x^{(t)})_{t}\). Denoting by \(p(x|\theta ^{'})\) the likelihood of this model and by \(p(x,\nu |\theta ^{'})\) the joint likelihood, we have

$$\begin{aligned} q(\nu ) = \frac{p(x,\nu |\theta ^{'})}{ p(x|\theta ^{'})}. \end{aligned}$$

Therefore

$$\begin{aligned}&E_{\nu }\left( \log q\left( \nu \right) \right) =E_{\nu }\left( \log p\left( \nu |\mu _{0},A,\Phi ,V_{0}\right) \right) \\&\quad +E_{\nu }\left( \log p\left( \dfrac{\sum _{s=1}^{S}\hat{\gamma }_{s}^{\left( t\right) }}{S}|\nu ^{\left( t\right) },\dfrac{\Sigma }{S},B\right) \right) - \log p\left( x|\theta ^{'}\right) . \end{aligned}$$

This leads to,

$$\begin{aligned}&E_{\nu }\left( \log p\left( \nu |\mu _{0},A,\Phi ,V_{0}\right) \right) -E_{\nu }\left( \log q\left( \nu \right) \right) \\&\quad =-E_{\nu }\left( \log p\left( \dfrac{\sum _{s=1}^{S}\hat{\gamma }_{s}^{\left( t\right) }}{S}|\nu ^{\left( t\right) },\dfrac{\Sigma }{S},B\right) \right) +\log p\left( x|\theta ^{'}\right) , \end{aligned}$$

and \(\tilde{{\mathcal {L}}}(q,\theta ,\xi )\) can be written as follows:

$$\begin{aligned} \tilde{{\mathcal {L}}}\left( q,\theta ,\xi \right)= & {} \sum _{Z}\int _{\gamma }\int _{\nu }q\left( Z,\gamma ,\nu \right) \log \dfrac{p\left( X|Z,\Pi \right) h\left( Z,\gamma ,\xi \right) p\left( \gamma |\nu ,\Sigma \right) p\left( \nu |\mu _{0},A,\Phi ,V_{0}\right) }{q\left( Z,\gamma ,\nu \right) }\\= & {} E_{Z,\gamma ,\nu }\left[ \log \dfrac{p\left( X|Z,\Pi \right) h\left( Z,\gamma ,\xi \right) p\left( \gamma |\nu ,\Sigma ,B\right) p\left( \nu |\mu _{0},A,\Phi ,V_{0}\right) }{q\left( \gamma \right) q\left( \nu \right) \prod _{i=1}^{N}q\left( Z_{i}\right) }\right] \\= & {} E_{Z}\left( \log p\left( X|Z,\Pi \right) \right) +E_{Z,\gamma }\left( \log h\left( Z,\gamma ,\xi \right) \right) +E_{\gamma ,\nu }\left( \log p\left( \gamma |\nu ,\Sigma ,B\right) \right) \\&-\, E_{\gamma }\left( \log q\left( \gamma \right) \right) -E_{\nu }\left( \log p\left( \dfrac{\sum _{s=1}^{S}\hat{\gamma }_{s}^{\left( t\right) }}{S}|\nu ^{\left( t\right) },\dfrac{\Sigma }{S},B\right) \right) -E_{Z}\left( \log \left( \prod _{i=1}^{N}q\left( Z_{i}\right) \right) \right) \\&+\, \log p\left( x|\theta ^{'}\right) . \end{aligned}$$

We detail below each term of the bound \(\tilde{{\mathcal {L}}}(q,\theta ,\xi )\).

  1.

    \(E_{Z}(\log p(X|Z,\Pi ))\):

    $$\begin{aligned} E_{Z}(\log p(X|Z,\Pi ))= & {} \sum _{t=1}^{T}\sum _{k,l}^{K}\sum _{c=0}^{C}\sum _{i\ne j}^{N}E_{Z}\big (\delta (X_{ij}^{(t)}=c)Z_{ik}^{(t)}Z_{jl}^{(t)}\big )\log (\Pi _{kl}^{c})\\= & {} \sum _{t=1}^{T}\sum _{k,l}^{K}\sum _{c=0}^{C}\sum _{i\ne j}^{N}\delta (X_{ij}^{(t)}=c)\tau _{ik}^{(t)}\tau _{jl}^{(t)}\log (\Pi _{kl}^{c}) \end{aligned}$$
  2.

    \(E_{Z,\gamma }(\log h(Z,\gamma ,\xi ))\):

    $$\begin{aligned}&E_{Z,\gamma }\Big (\log h(Z,\gamma ,\xi )\Big ) \\&\quad = E_{Z,\gamma }\Big [\sum _{t=1}^{T}\sum _{k=1}^{K}\sum _{i=1}^{N}\sum _{s=1}^{S} y_{is} Z_{ik}^{(t)}\Big (\gamma _{sk}^{(t)}-\big (\xi _{s}^{-1(t)}\sum _{l=1}^{K}\exp \big (\gamma _{sl}^{(t)}\big )\\&\qquad - 1+\log \big (\xi _{s}^{(t)}\big )\big )\Big )\Big ]\\&\quad = \sum _{t=1}^{T}\sum _{k=1}^{K}\sum _{i=1}^{N}\sum _{s=1}^{S} y_{is} \Big (\tau _{ik}^{(t)}\hat{\gamma }_{sk}^{(t)}-\tau _{ik}^{(t)}\xi _{s}^{-1(t)}\sum _{l=1}^{K}\exp \Big (\hat{\gamma }_{sl}^{(t)}+\dfrac{\hat{\sigma }_{sl}^{2^{(t)}}}{2}\Big )\\&\qquad + \tau _{ik}^{(t)}-\tau _{ik}^{(t)}\log \big (\xi _{s}^{(t)}\big )\Big )\\&\quad = \sum _{t=1}^{T}\sum _{s=1}^{S}\Big (\sum _{k=1}^{K}r_{sk}^{(t)}\hat{\gamma }_{sk}^{(t)}-N_{s}\xi _{s}^{-1(t)}\sum _{l=1}^{K}\exp \Big (\hat{\gamma }_{sl}^{(t)}+\dfrac{\hat{\sigma }_{sl}^{2^{(t)}}}{2}\Big )+N_{s}-N_{s}\log \big (\xi _{s}^{(t)}\big )\Big ) \end{aligned}$$

    where \(r_{sk}^{(t)}=\sum _{i=1}^{N}\tau _{ik}^{(t)}y_{is}\) and \(N_{s}=\sum _{i=1}^{N}y_{is}=\sum _{k=1}^{K}r_{sk}^{(t)}\) is the number of vertices in subgraph \(s\).

  3.

    \(E_{\gamma ,\nu }(\log p(\gamma |\nu ,\Sigma ,B))\):

    $$\begin{aligned} E_{\gamma ,\nu }(\log p(\gamma |\nu ,\Sigma ,B))= & {} E_{\gamma ,\nu }\Big (\log \prod _{t=1}^{T}\prod _{s=1}^{S}{\mathcal {N}}(\gamma _{s}^{(t)};B\nu ^{(t)},\Sigma )\Big )\\= & {} \sum _{t=1}^{T}\sum _{s=1}^{S}\Big (\log {\mathcal {N}}(\hat{\gamma }_{s}^{(t)};B\hat{\nu }^{(t)},\Sigma )-\dfrac{1}{2}tr(\Sigma ^{-1}B^{\intercal }\hat{V}^{(t)}B)\\&-\,\dfrac{1}{2}tr\Big (\Sigma ^{-1}\hat{\sigma }_{s}^{(t)^{2}}\Big )\Big ) \end{aligned}$$
  4.

    \(E_{\gamma }(\log q(\gamma ))\):

    $$\begin{aligned} E_{\gamma }(\log q(\gamma ))= & {} E_{\gamma }\Big (\log \prod _{t=1}^{T}\prod _{s=1}^{S}\prod _{k=1}^{K}{\mathcal {N}}(\gamma _{sk}^{(t)};\hat{\gamma }_{sk}^{(t)},\hat{\sigma }_{sk}^{2^{(t)}})\Big )\\= & {} \sum _{t=1}^{T}\sum _{s=1}^{S}\sum _{k=1}^{K} -\log \Big ( (2\pi )^{\frac{1}{2}}\hat{\sigma }_{sk}^{(t)}\Big )-\dfrac{TKS}{2}. \end{aligned}$$
  5.

    \(E_{\nu }\Big (\log p\Big (\dfrac{\sum _{s=1}^{S}\hat{\gamma }_{s}^{(t)}}{S}|\nu ^{(t)},\dfrac{\Sigma }{S},B\Big )\Big )\) :

    $$\begin{aligned}&E_{\nu }\Big (\log p\Big (\dfrac{\sum _{s=1}^{S}\hat{\gamma }_{s}^{(t)}}{S}|\nu ^{(t)},\dfrac{\Sigma }{S},B\Big )\Big )\\&\quad = \sum _{t=1}^{T}\Big (\log {\mathcal {N}}(x^{(t)};B\hat{\nu }^{(t)},\Sigma /S)-\dfrac{1}{2}tr(\Sigma ^{-1}SB^{T}\hat{V}^{(t)}B)\Big ). \end{aligned}$$
  6.

    \(E_{Z}(\log (\prod _{i=1}^{N}q(Z_{i})))\):

$$\begin{aligned} E_{Z}\left( \log \left( \prod _{i=1}^{N}q\left( Z_{i}\right) \right) \right)= & {} \sum _{i=1}^{N}E_{Z}\left( \log q\left( Z_{i}\right) \right) \\= & {} \sum _{i=1}^{N}E_{Z}\left( \sum _{t=1}^{T}\sum _{k=1}^{K}Z_{ik}^{\left( t\right) }\log \left( \tau _{ik}^{\left( t\right) }\right) \right) \\= & {} \sum _{i=1}^{N}\sum _{t=1}^{T}\sum _{k=1}^{K}\tau _{ik}^{\left( t\right) }\log \left( \tau _{ik}^{\left( t\right) }\right) . \end{aligned}$$
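The closed form in term 4 above is the Gaussian entropy \(-E[\log {\mathcal {N}}(\gamma ;\hat{\gamma },\hat{\sigma }^{2})]=\log ((2\pi )^{1/2}\hat{\sigma })+1/2\) applied to each variational factor; a quick Monte Carlo check for one factor (illustrative values only):

```python
import numpy as np

rng = np.random.default_rng(3)
g_hat, s_hat = 0.7, 1.3                  # one variational factor N(g_hat, s_hat^2)
samples = rng.normal(g_hat, s_hat, size=2_000_000)

# Monte Carlo estimate of E[log q(gamma)] for a single Gaussian factor
log_q = -0.5 * np.log(2 * np.pi * s_hat**2) - (samples - g_hat) ** 2 / (2 * s_hat**2)
mc = log_q.mean()

closed = -np.log(np.sqrt(2 * np.pi) * s_hat) - 0.5  # closed form used in term 4

assert abs(mc - closed) < 1e-2
```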


Cite this article

Zreik, R., Latouche, P. & Bouveyron, C. The dynamic random subgraph model for the clustering of evolving networks. Comput Stat 32, 501–533 (2017). https://doi.org/10.1007/s00180-016-0655-5
