Abstract
The Stochastic Block Model (SBM) provides a statistical tool for modeling and clustering network data. In this paper, we propose an extension of this model for discrete-time dynamic networks that takes into account the variability in node degrees, allowing us to model a broader class of networks. We develop a probabilistic model that generates temporal graphs with a dynamic cluster structure and time-dependent degree corrections for each node. Thanks to these degree corrections, the nodes can have variable in- and out-degrees, allowing us to model complex cluster structures as well as interactions that decrease or increase over time. We compare the proposed model to a model without degree correction and highlight its advantages in the case of inhomogeneous degree distributions within clusters and in the recovery of unstable cluster dynamics. We propose an inference procedure based on Variational Expectation-Maximization (VEM) that also provides the means to estimate the time-dependent degree corrections. Extensive experiments on simulated and real datasets confirm the benefits of our approach and show the effectiveness of the proposed algorithm.
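For concreteness, the generative process described in the abstract (Markov cluster dynamics plus Poisson edge counts with time-dependent degree corrections) can be illustrated as follows. This is a minimal sketch, not the authors' reference implementation; the function names and the Knuth Poisson sampler are ours.

```python
import math
import random

def knuth_poisson(rate, rng):
    """Sample from a Poisson distribution (Knuth's method; fine for small rates)."""
    L, k, p = math.exp(-rate), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def sample_dynamic_graph(N, K, T, alpha, pi, gamma, mu, nu, seed=0):
    """Sketch of a degree-corrected dynamic SBM generative process:
    each node follows a Markov chain over K clusters (initial law alpha,
    transition matrix pi); the directed edge count X[t][i][j] is Poisson
    with rate mu[t][i] * nu[t][j] * gamma[z_i][z_j], so mu and nu act as
    time-dependent out- and in-degree corrections."""
    rng = random.Random(seed)
    # latent cluster trajectories
    z = [[0] * N for _ in range(T)]
    for i in range(N):
        z[0][i] = rng.choices(range(K), weights=alpha)[0]
        for t in range(1, T):
            z[t][i] = rng.choices(range(K), weights=pi[z[t - 1][i]])[0]
    # weighted directed adjacency tensors (no self-loops)
    X = [[[0] * N for _ in range(N)] for _ in range(T)]
    for t in range(T):
        for i in range(N):
            for j in range(N):
                if i != j:
                    rate = mu[t][i] * nu[t][j] * gamma[z[t][i]][z[t][j]]
                    X[t][i][j] = knuth_poisson(rate, rng)
    return z, X
```

A diagonally dominant `pi` yields temporally stable clusters, while varying `mu` and `nu` over time produces the increasing or decreasing interaction patterns mentioned above.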
Notes
We could not compare pdc-dsbm directly to the authors’ algorithm because their R package dynsbm V0.7 only implements Bernoulli, Multinomial and Gaussian distributions; we therefore re-implemented it for Poisson distributions.
More noise needs to be added for \(M_+\) since margins greater than one spread the classes apart.
References
Abbe E (2017) Community detection and stochastic block models: recent developments. J Mach Learn Res 18(1):6446–6531
Affeldt S, Labiod L, Nadif M (2021) Regularized bi-directional co-clustering. Stat Comput 31(3):1–17
Ailem M, Role F, Nadif M (2017) Model-based co-clustering for the effective handling of sparse data. Pattern Recognit 72:108–122
Ailem M, Role F, Nadif M (2017) Sparse Poisson latent block model for document clustering. IEEE Trans Knowl Data Eng 29(7):1563–1576
Airoldi E, Blei D, Fienberg S, Xing E (2008) Mixed membership stochastic blockmodels. J Mach Learn Res 9:1981–2014
Banerjee A, Dhillon I, Ghosh J, Merugu S, Modha DS (2007) A generalized maximum entropy approach to Bregman co-clustering and matrix approximation. J Mach Learn Res 8(67):1919–1986
Barabási A-L, Albert R (1999) Emergence of scaling in random networks. Science 286(5439):509–512
Bartolucci F, Pandolfi S (2020) An exact algorithm for time-dependent variational inference for the dynamic stochastic block model. Pattern Recognit Lett 138:362–369
Benzecri J-P (1973) L’analyse des données, tome 2: l’analyse des correspondances. Dunod, Paris
Biernacki C, Celeux G, Govaert G (2000) Assessing a mixture model for clustering with the integrated completed likelihood. IEEE Trans Pattern Anal Mach Intell 22(7):719–725
Bock H-H (2020) Co-clustering for object by variable data matrices. In: Imaizumi T, Nakayama A, Yokoyama S (eds) Advanced studies in behaviormetrics and data science: essays in Honor of Akinori Okada. Springer Singapore, Singapore, pp 3–17
Chi Y, Song X, Zhou D, Hino K, Tseng BL (2007) Evolutionary spectral clustering by incorporating temporal smoothness. In: KDD. Association for Computing Machinery, pp 153–162
Corneli M, Latouche P, Rossi F (2016) Exact ICL maximization in a non-stationary temporal extension of the stochastic block model for dynamic networks. Neurocomputing 192:81–91
Corneli M, Latouche P, Rossi F (2018) Multiple change points detection and clustering in dynamic networks. Stat Comput 28(5):989–1007
Daudin JJ, Picard F, Robin S (2008) A mixture model for random graphs. Stat Comput 18(2):173–183
Fu W, Song L, Xing EP (2009) Dynamic mixed membership blockmodel for evolving networks. In: ICML, pp 329–336
Ghahramani Z, Jordan MI (1997) Factorial hidden Markov models. Mach Learn 29(2–3):245–273
Govaert G, Nadif M (2005) An EM algorithm for the block mixture model. IEEE Trans Pattern Anal Mach Intell 27(4):643–647
Govaert G, Nadif M (2013) Co-clustering: models, algorithms and applications. Wiley, Hoboken
Govaert G, Nadif M (2018) Mutual information, phi-squared and model-based co-clustering for contingency tables. Adv Data Anal Classif 12(3):455–488
Greenacre M (2007) Correspondence analysis in practice. Chapman & Hall/CRC, Boca Raton
Hubert L, Arabie P (1985) Comparing partitions. J Classif 2(1):193–218
Karrer B, Newman ME (2011) Stochastic blockmodels and community structure in networks. Phys Rev E Stat Nonlinear Soft Matter Phys 83(1):016107
Lin Y-R, Chi Y, Zhu S, Sundaram H, Tseng BL (2009) Analyzing communities and their evolutions in dynamic social networks. ACM Trans Knowl Discovery Data 3(2):1–31
Liu S, Wang S, Krishnan R (2014) Persistent community detection in dynamic social networks. In: Tseng VS, Ho TB, Zhou Z-H, Chen ALP, Kao H-Y (eds) Advances in knowledge discovery and data mining. Springer, Berlin, pp 78–89
Mariadassou M, Robin S, Vacher C (2010) Uncovering latent structure in valued graphs: a variational approach. Ann Appl Stat 4(2):715–742
Matias C, Miele V (2017) Statistical clustering of temporal networks through a dynamic stochastic block model. J R Stat Soc Ser B Stat Methodol 79(4):1119–1141
Matias C, Rebafka T, Villers F (2018) A semiparametric extension of the stochastic block model for longitudinal networks. Biometrika 105(3):665–680
Meng X-L, Rubin DB (1993) Maximum likelihood estimation via the ECM algorithm: a general framework. Biometrika 80(2):267–278
Neal RM, Hinton GE (1998) A view of the EM algorithm that justifies incremental, sparse, and other variants. Learning in graphical models. Springer, Berlin, pp 355–368
Qiao M, Yu J, Bian W, Li Q, Tao D (2017) Improving stochastic block models by incorporating power-law degree characteristic. In: IJCAI international joint conference on artificial intelligence, pp 2620–2626
Rastelli R, Latouche P, Friel N (2018) Choosing the number of groups in a latent stochastic blockmodel for dynamic networks. Netw Sci 6(4):469–493
Razaee Z, Amini A, Li JJ (2019) Matched bipartite block model with covariates. J Mach Learn Res 20:1–44
Salah A, Nadif M (2019) Directional co-clustering. Adv Data Anal Classif 13(3):591–620
Schepers J, Bock H-H, Van Mechelen I (2017) Maximal interaction two-mode clustering. J Classif 34(1):49–75
Sewell DK, Chen Y (2016) Latent space models for dynamic networks with weighted edges. Soc Netw 44:105–116
Snijders T, Nowicki K (1997) Estimation and prediction for stochastic blockmodels for graphs with latent block structure. J Classif 14:75–100
Wang YJ, Wong GY (1987) Stochastic blockmodels for directed graphs. J Am Stat Assoc 82(397):8–19
Xu KS, Hero AO (2014) Dynamic stochastic blockmodels for time-evolving social networks. IEEE J Sel Top Signal Process 8(4):552–562
Yang T, Chi Y, Zhu S, Gong Y, Jin R (2011) Detecting communities and their evolutions in dynamic social networks—a Bayesian approach. Mach Learn 82(2):157–189
Acknowledgements
We thank the three anonymous reviewers for their detailed comments, which greatly helped us improve this manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Appendices
Derivation of the objective criterion (6)
We derive the criterion (6) in the case of a constant number of nodes (\(\forall t, \, V^t = V\)), the other case easily follows. Let Q be a probability over the space of complete data \(\mathcal {Z}\), i.e. the set of all possible latent trajectories for N nodes, over K possible states and T time steps. From Neal and Hinton (1998), we have:
Let Q factorize as N independent inhomogeneous Markov models:
where Q is parameterized by \({\varvec{q}}= \Big (\big (q(i, k)\big )_{ik}, \big (q(t, i, k, \ell )\big )_{tikl}\Big )\), with \(q(i, k) = Q(Z_{ik}^{1}=1)\), \(q(t, i, k, \ell ) = Q(Z_{i\ell }^{t}=1 | Z_{ik}^{t-1}=1)\).
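Under this factorization, the marginal probabilities \(Q(Z_{ik}^{t}=1)\) used throughout the appendix follow from the initial and transition parameters by a forward recursion over each node's chain; a minimal sketch for a single node (function name ours):

```python
def variational_marginals(q_init, q_trans):
    """Forward recursion for one node's inhomogeneous Markov chain.
    q_init[k]          = Q(Z^1_k = 1)
    q_trans[t][k][l]   = Q(Z^{t+1}_l = 1 | Z^t_k = 1)  (one K x K matrix per step)
    Returns marginals m[t][k] = Q(Z^t_k = 1), computed via
    m[t][l] = sum_k m[t-1][k] * q_trans[t-1][k][l]."""
    K = len(q_init)
    marginals = [list(q_init)]
    for trans in q_trans:
        prev = marginals[-1]
        marginals.append([sum(prev[k] * trans[k][l] for k in range(K))
                          for l in range(K)])
    return marginals
```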
First, from the variational distributions, we have:
Secondly, from the model, we have:
To develop \(F({\varvec{q}}, {\varvec{\theta }})\) we rely on the following Lemma.
Lemma 1
We have the following equalities:
Thus, from expressions (14) and (15) and Lemma 1, the variational lower bound on the log-likelihood of the model is given by:
Proof
(of Lemma 1) As the Markov chains \({\varvec{Z}}_i\) and \({\varvec{Z}}_j\) are independent under the distribution Q, it suffices to prove that \(\mathbb {E}_Q(Z_{ik}^{t}) = q(t, i, k)\). The proofs for (16a) and (16b) are analogous.
In the paper, the latent processes are defined over the index set \(\{1, \ldots , T\}\). In the following, we consider a virtual source cluster \(k_s\) at virtual time step \(t = 0\) from which every node starts. Let \(\mathcal {Z}_i^{(t, t')}(k)\) be all the possible latent trajectories for node i over \(t' - t\) time steps, starting at cluster k at time t:
For \(t \le \tau \le t'\) and \(k' \in \{1, \ldots , K\}\), we define \(\mathcal {Z}_i^{(t, t')}(k, \tau , k')\) as the set of all paths of \(\mathcal {Z}_i^{(t, t')}(k)\) that pass through cluster \(k'\) at time step \(\tau \):
Let \(Q^i\) be the distribution for node i. As the N chains are independent:
Note that \(\mathcal {Z}_i^{(0, T)}(k_s, t, k)\) decomposes as \(\mathcal {Z}_i^{(0, t-1)}(k_s) \times \mathcal {Z}_i^{(t, T)}(k)\). In the following, we identify the elements of the sets with their index (\((k_0^{(c)}, \ldots , k_T^{(c)})\) denotes the cth element of \(\mathcal {Z}_i^{(0, T)}(k)\)). For consistency of notation, we define \(q(1, i, k_s, k') = q(i, k')\), the transition probability from the virtual cluster \(k_s\) at \(t=0\). We can then write:
The second sum in the last equation corresponds to summing over all possible paths in a chain of length \(T-t\) starting at cluster k, so it equals one. Now, recall that:
This concludes the proof. \(\square \)
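The identity at the heart of this proof — the marginal \(\mathbb {E}_Q(Z_{ik}^{t})\) equals the sum, over all latent trajectories passing through cluster k at time t, of the product of the transition probabilities along the path — can be verified by brute-force path enumeration on a small chain (a sketch with our own names; time indices are 0-based in the code):

```python
from itertools import product

def marginal_by_path_sum(q_init, q_trans, t, k):
    """Sum, over every trajectory (k_0, ..., k_{T-1}) with k_t = k, of
    q_init[k_0] * prod_s q_trans[s][k_s][k_{s+1}]; Lemma 1 says this
    equals the forward-recursion marginal Q(Z^t_k = 1)."""
    K = len(q_init)
    T = len(q_trans) + 1
    total = 0.0
    for path in product(range(K), repeat=T):
        if path[t] != k:   # keep only paths through cluster k at time t
            continue
        p = q_init[path[0]]
        for s in range(T - 1):
            p *= q_trans[s][path[s]][path[s + 1]]
        total += p
    return total
```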
Derivation of the expectation step
Here, we present a way to derive the formulae proposed in the E-step for a fixed set of nodes (i.e. \(\forall t,\; V^t = V\)). The results for a variable number of nodes follow easily.
As shown in Bartolucci and Pandolfi (2020), the exact VE step can be performed but is computationally heavy. In fact, in order to optimize \(F({\varvec{q}}, {\varvec{\theta }})\) w.r.t. \(q(t, i, k, \ell )\), we notice that every \(q(t', i, k')\) with \(t' \ge t\) depends on \(q(t, i, k, \ell )\). Here, we instead propose a VE step that merely increases \(F({\varvec{q}}, {\varvec{\theta }})\) w.r.t. the variational parameters \({\varvec{q}}\).
We consider the variational parameters \({\varvec{q}}(i) = \Big (\big (q(i, k)\big )_{k}, \big (q(t, i, k, \ell )\big )_{tkl}\Big )\) as well as auxiliary variables \({\varvec{q}}_m^t(i) = \big (q(t, i, k)\big )_{ik}\) for the marginal probabilities, where \(q(t, i, k) = Q(Z_{ik}^{t}=1)\).
We first note that F can be decomposed over each node and cluster thanks to the variational approximation: \(F({\varvec{q}}, {\varvec{\theta }}) = \sum _{i\ell } F_{i\ell } ({\varvec{q}}(i), {\varvec{q}}_m(-i), {\varvec{\theta }})\) where \({\varvec{q}}_m(-i) = \big ({\varvec{q}}_m^1(j), \ldots , {\varvec{q}}_m^T(j)\big )_{j \ne i}\) and
where we note \(\Phi _{ijk\ell }^{t} = \phi _{ijk\ell }^{t}\phi _{ji\ell k}^{t}\).
For constant marginal probabilities \({\varvec{q}}_m(-i)\), we optimize
by applying a single step of coordinate ascent to each coordinate block \(({\varvec{q}}^t(i),{\varvec{q}}_m^t(i))\), keeping the other blocks \(({\varvec{q}}^{-t}(i),{\varvec{q}}_m^{-t}(i))\) fixed. We apply this procedure sequentially, for t in \(\{1, \ldots , T\}\), and update the marginal probabilities q(t, i, k) with the obtained transition probabilities \(q(t, i, k, \ell )\) at each time step.
The formulae of the E-step can be obtained as follows. Since \(q(t, i, k) = \sum _{k'} q(t-1, i, k') q(t, i, k', k)\), the block \(({\varvec{q}}^t(i),{\varvec{q}}_m^t(i))\) only depends on \({\varvec{q}}^t(i)\). For \(t \ge 2\), we can write:
Let \(\mathcal {L}({\varvec{q}}^t(i), \lambda )\) be the Lagrangian of the constrained optimization problem:
For \((q(t', i, k))_{k \in \{1, \ldots , K\}, t'\ne t}\) constant and \(s \in \{1, \ldots , T\}\), we have:
and
where \({\varvec{q}}(t+1,i,\ell , :) = (q(t+1,i,\ell , 1), \ldots , q(t+1,i,\ell , K))^\intercal \) and \({\varvec{\pi }}_{\ell , :} = \big (\pi _{1\ell }, \ldots , \pi _{K\ell }\big )^\intercal \). Let \({d_{ik}^t = D_{\text {KL}}({\varvec{q}}(t,i,k, :) || {\varvec{\pi }}_{k, :})}\). We then have:
Setting the derivative of the Lagrangian to zero, we have:
Thus, \(q(t, i, k, \ell ) \propto \pi _{k\ell } \exp (- d_{i\ell }^{t+1}) \prod _{j \ne i} \prod _{\ell '} {\Phi _{ij\ell \ell '}^{t}}^{q(t, j, \ell ')}\). This justifies the proposed formula. Note that, contrary to Matias and Miele (2017), this formula applies a penalty term \(\exp (- d_{i\ell }^{t+1})\) to the mixture proportions; the E-step formula of Matias and Miele (2017) appears to be an approximation of ours. In our experiments, we observed that our formula gives better clustering results when the data have many cluster transitions (\({\varvec{\pi }}\) has low trace) and the margins are not smoothed, and comparable results when the margins are smoothed.
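The resulting update can be sketched as follows. This is a simplified illustration for one node and one time step: the emission factors \(\Phi \) are assumed precomputed and folded into `log_phi`, and all function names are ours.

```python
import math

def kl(p, q):
    """KL divergence D(p || q) between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def transition_update(pi, q_next, log_phi):
    """Sketch of the penalized E-step update
        q(t,i,k,l) ∝ pi[k][l] * exp(-d_l) * exp(log_phi[l]),
    where d_l = KL(q(t+1,i,l,:) || pi[l,:]) penalizes moving into a
    cluster whose outgoing transition row deviates from the prior row,
    and log_phi[l] stands for sum_{j!=i} sum_{l'} q(t,j,l') * log Phi_{ij l l'}
    (assumed precomputed from the emission terms, not defined here)."""
    K = len(pi)
    d = [kl(q_next[l], pi[l]) for l in range(K)]
    q = [[pi[k][l] * math.exp(-d[l] + log_phi[l]) for l in range(K)]
         for k in range(K)]
    # normalize each row so that q(t,i,k,:) is a probability distribution
    return [[v / sum(row) for v in row] for row in q]
```

When the next-step transition rows already match \({\varvec{\pi }}\) (so every \(d_{i\ell }^{t+1} = 0\)) and the emission factors are uninformative, the update reduces to the prior rows, as expected.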
Derivation of the M-step
To update the parameters in the maximization step, we increase \(F({\varvec{q}}, {\varvec{\theta }})\) w.r.t. \({\varvec{\theta }}\) by maximizing F for each parameter, conditionally on the others. We first update the mixture proportions \({\varvec{\alpha }}\) and \({\varvec{\pi }}\), since they only depend on \({\varvec{q}}\). Next, we update \({\varvec{\gamma }}\), then \({\varvec{\mu }}\) and finally \({\varvec{\nu }}\). The updates (8a, 8b) with respect to \({\varvec{\alpha }}\) and \({\varvec{\pi }}\) are direct. Concerning \({\varvec{\mu }}\), \({\varvec{\nu }}\) and \({\varvec{\gamma }}\), the lower-bound on the log-likelihood of the model is:
By computing the derivative of (17) w.r.t. \(\mu _i^t\), \(\nu _j^t\) and \(\gamma _{k\ell }\) and setting it to zero we obtain the maximization step in (8d, 8e, 8c).
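As an illustration of the "direct" updates for \({\varvec{\alpha }}\) and \({\varvec{\pi }}\), here is the usual HMM-style form; this is a sketch under the assumption that (8a, 8b) take this standard shape (the updates for \({\varvec{\mu }}\), \({\varvec{\nu }}\), \({\varvec{\gamma }}\) are not reproduced). All names are ours.

```python
def update_mixture_params(q_init, q_trans, marginals):
    """Sketch of the direct M-step updates for the mixture proportions:
        alpha_k ∝ sum_i q(1, i, k)
        pi_{kl} ∝ sum_{t>=2} sum_i q(t-1, i, k) * q(t, i, k, l)
    q_init[i][k], q_trans[i][t][k][l] and marginals[i][t][k] index nodes i,
    time steps t (0-based) and clusters k, l."""
    N, K = len(q_init), len(q_init[0])
    alpha = [sum(q_init[i][k] for i in range(N)) / N for k in range(K)]
    pi_num = [[0.0] * K for _ in range(K)]
    for i in range(N):
        for t, trans in enumerate(q_trans[i]):
            for k in range(K):
                for l in range(K):
                    pi_num[k][l] += marginals[i][t][k] * trans[k][l]
    pi = []
    for row in pi_num:
        s = sum(row)
        pi.append([v / s for v in row] if s > 0 else [1.0 / K] * K)
    return alpha, pi
```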
Model selection with the ICL criterion
In order to choose the appropriate number of clusters K, we consider the Integrated Classification Likelihood (ICL) criterion (Biernacki et al. 2000), as proposed in Daudin et al. (2008) for the static SBM and in Corneli et al. (2016), Matias and Miele (2017) and Rastelli et al. (2018) for dynamic models based on the SBM. The ICL criterion for a model \(M_K\) with K clusters is defined as:
where \({\varvec{\theta }}= ({\varvec{\alpha }}, {\varvec{\pi }}, {\varvec{\mu }}, {\varvec{\nu }}, {\varvec{\gamma }}) \in {\varvec{\varTheta }}\), \({\varvec{\varTheta }} = A_K \times A_K^K \times \mathbb {R}^{+TN} \times \mathbb {R}^{+TN} \times \mathbb {R}^{+TK^2}\), \(A_K\) is the K-dimensional simplex and g is the density of the prior distribution on \({\varvec{\varTheta }}\).
Let \(g_{\pi _k}({\varvec{\pi }}_k|M_{K}) = \frac{1}{B(\delta , \ldots , \delta )}\prod _{k'} \pi _{kk'}^{\delta - 1}\) be a prior on \({\varvec{\pi }}_k\), the kth row of \({\varvec{\pi }}\).
where \(n_{kk'}^z = \sum _{t \ge 2}\sum _i Z_{ik}^{t-1}Z_{ik'}^{t}\) and \(I_\alpha \) is computed as in Daudin et al. (2008).
We use Stirling’s formula \(\log \varGamma (x) \approx (x - \frac{1}{2}) \log (x - 1) - (x - 1) + \frac{1}{2} \log (2\pi )\), which remains valid for small values of x. Thus, Stirling’s formula for \(\log \varGamma (n_{kk'}^z + \delta )\) remains accurate even for small values of \(n_{kk'}^z\). Following Biernacki et al. (2000), it can be shown that, assuming \(K = o(N)\) and removing terms in O(1) (since the error term of the BIC is O(1)):
where \(n_{k.}^z = \sum _{k'} n_{kk'}^z\) and \(Z_{.k}^1 = \sum _i Z_{ik}^1\). Using the hypothesis \(n_{k.}^z = \frac{N(T-1)}{K}\), we have \(\sum _k \log n_{k.}^z = K\log N(T-1) + o(N)\). Replacing \({\varvec{Z}}\) by \(\widehat{{\varvec{Z}}}\), the estimated partition, we obtain:
The term \(\frac{K - 1}{2} \log N\) is due to the estimated parameter \({\varvec{\alpha }}\). In Matias and Miele (2017), the parameter \({\varvec{\alpha }}\) is not estimated and is considered to be equal to the stationary distribution of \({\varvec{\pi }}\). Omitting the term due to \({\varvec{\alpha }}\) in the proposed ICL results in the same ICL as proposed in Matias and Miele (2017).
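The Stirling-type approximation invoked in this derivation — with the standard constant \(\frac{1}{2}\log (2\pi )\), as follows from applying Stirling's formula to \((x-1)!\) — can be checked numerically against the exact log-gamma function (function name ours):

```python
import math

def stirling_lgamma(x):
    """Stirling-type approximation to log Gamma(x), obtained by applying
    Stirling's formula to (x-1)!:
        log Gamma(x) ≈ (x - 1/2) log(x - 1) - (x - 1) + (1/2) log(2*pi).
    Already reasonably accurate for moderate x (requires x > 1)."""
    return (x - 0.5) * math.log(x - 1) - (x - 1) + 0.5 * math.log(2 * math.pi)
```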
We note that we have no guarantee that Dirichlet priors with Jeffreys uninformative parameters on each row of \({\varvec{\pi }}\) are a good choice. In fact, with this dynamic model, we are interested in partitions that are relatively stable over time, which implies that \({\varvec{\pi }}\) should be diagonally dominant. Thus, contrary to the mixture proportions in mixture models, the prior on the rows of \({\varvec{\pi }}\) should favor some dimensions of the simplex: \({\varvec{\pi }}_k\), the kth row of \({\varvec{\pi }}\), could have a prior of the form \(\text {Dir}({\varvec{\delta }}_k)\), with \(\delta _{k\ell } = \delta _0\) if \(k \ne \ell \) and \(\delta _{kk} = \delta _{\text {diag}}\), where \(\delta _{\text {diag}} > \delta _0\).
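Such a "sticky" Dirichlet prior can be sampled via the standard normalized-Gamma construction; a sketch with illustrative hyperparameter values (not taken from the paper):

```python
import random

def sample_sticky_transition_row(K, k, delta0=0.5, delta_diag=4.0, rng=None):
    """Sample the k-th row of pi from a Dirichlet(delta_k) prior with
    delta_{kl} = delta0 for l != k and delta_{kk} = delta_diag > delta0,
    favoring diagonally dominant (temporally stable) transition matrices.
    delta0 and delta_diag are illustrative values, not the paper's.
    Dirichlet sampling via normalized Gamma draws (standard construction)."""
    rng = rng or random.Random(0)
    gammas = [rng.gammavariate(delta_diag if l == k else delta0, 1.0)
              for l in range(K)]
    s = sum(gammas)
    return [g / s for g in gammas]
```

With these values the prior mean of the diagonal entry is \(\delta _{\text {diag}} / (\delta _{\text {diag}} + (K-1)\delta _0)\), so sampled rows concentrate mass on staying in the same cluster.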
Riverain, P., Fossier, S. & Nadif, M. Poisson degree corrected dynamic stochastic block model. Adv Data Anal Classif 17, 135–162 (2023). https://doi.org/10.1007/s11634-022-00492-9