Abstract
We propose a new stochastic block model that focuses on the analysis of interaction lengths in dynamic networks. The model does not rely on a discretization of the time dimension and may be used to analyze networks that evolve continuously over time. The framework relies on a clustering structure on the nodes, whereby two nodes belonging to the same latent group tend to create interactions and non-interactions of similar lengths. We introduce a variational expectation–maximization algorithm to perform inference, and adapt a widely used clustering criterion to perform model choice. Finally, we validate our methodology through experiments on simulated data and two illustrative applications, one to face-to-face interaction data and one to a bike sharing network.
References
Airoldi EM, Blei DM, Fienberg SE, Xing EP (2008) Mixed membership stochastic blockmodels. J Mach Learn Res 9(Sep):1981–2014
Ambroise C, Matias C (2012) New consistent and asymptotically normal parameter estimates for random-graph mixture models. J R Stat Soc Ser B (Stat Methodol) 74(1):3–35
Baudry J, Celeux G (2015) EM for mixtures: initialization requires special care. Stat Comput 25(4):713–726
Biernacki C, Celeux G, Govaert G (2000) Assessing a mixture model for clustering with the integrated completed likelihood. IEEE Trans Pattern Anal Mach Intell 22(7):719–725
Biernacki C, Celeux G, Govaert G (2003) Choosing starting values for the EM algorithm for getting the highest likelihood in multivariate Gaussian mixture models. Comput Stat Data Anal 41(3):561–575
Blei DM, Kucukelbir A, McAuliffe JD (2017) Variational inference: a review for statisticians. J Am Stat Assoc 112(518):859–877
Bouveyron C, Latouche P, Zreik R (2018) The stochastic topic block model for the clustering of vertices in networks with textual edges. Stat Comput 28(1):11–31
Celisse A, Daudin JJ, Pierre L (2012) Consistency of maximum-likelihood and variational estimators in the stochastic block model. Electron J Stat 6:1847–1899
Côme E, Latouche P (2015) Model selection and clustering in stochastic block models based on the exact integrated complete data likelihood. Stat Model 15(6):564–589
Corneli M, Latouche P, Rossi F (2017) Multiple change points detection and clustering in dynamic networks. Stat Comput 28:1–19
Daudin JJ, Picard F, Robin S (2008) A mixture model for random graphs. Stat Comput 18(2):173–183
Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Ser B (Methodol) 39(1):1–38
Frühwirth-Schnatter S (2006) Finite mixture and Markov switching models. Springer, Berlin
Hanneke S, Fu W, Xing EP (2010) Discrete temporal models of social networks. Electron J Stat 4:585–605
Hoff PD, Raftery AE, Handcock MS (2002) Latent space approaches to social network analysis. J Am Stat Assoc 97(460):1090–1098
Holland PW, Leinhardt S (1981) An exponential family of probability distributions for directed graphs. J Am Stat Assoc 76(373):33–50
Hubert L, Arabie P (1985) Comparing partitions. J Classif 2(1):193–218
Mastrandrea R, Fournet J, Barrat A (2015) Contact patterns in a high school: a comparison between data collected using wearable sensors, contact diaries and friendship surveys. PLoS ONE 10(9):1–26
Matias C, Miele V (2017) Statistical clustering of temporal networks through a dynamic stochastic block model. J R Stat Soc Ser B (Stat Methodol) 79(4):1119–1141
Matias C, Rebafka T, Villers F (2018) A semiparametric extension of the stochastic block model for longitudinal networks. Biometrika 105(3):665–680
O’Hagan A, Murphy TB, Gormley IC (2012) Computational aspects of fitting mixture models via the expectation–maximization algorithm. Comput Stat Data Anal 56(12):3843–3864
R Core Team (2017) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna https://www.R-project.org/
Rastelli R (2019) Exact integrated completed likelihood maximisation in a stochastic block transition model for dynamic networks. J French Stat Soc 160(1):35–56
Rastelli R, Latouche P, Friel N (2018) Choosing the number of groups in a latent stochastic blockmodel for dynamic networks. Netw Sci. https://doi.org/10.1017/nws.2018.19 (to appear)
Sarkar P, Moore AW (2005) Dynamic social network analysis using latent space models. SIGKDD Explor Spec Ed Link Min 7:31–40
Scrucca L, Raftery AE (2015) Improved initialisation of model-based clustering using Gaussian hierarchical partitions. Adv Data Anal Classif 9(4):447–460
Sewell DK, Chen Y (2015) Latent space models for dynamic networks. J Am Stat Assoc 110(512):1646–1657
Snijders TAB (2005) Models for longitudinal network data. Models Methods Soc Netw Anal 1:215–247
Stephens M (2000) Dealing with label switching in mixture models. J R Stat Soc Ser B (Stat Methodol) 62(4):795–809
Transport for London (2016) http://cycling.data.tfl.gov.uk/. Accessed 11 Oct 2019
Von Luxburg U (2007) A tutorial on spectral clustering. Stat Comput 17(4):395–416
Wang YJ, Wong GY (1987) Stochastic blockmodels for directed graphs. J Am Stat Assoc 82(397):8–19
Wu CFJ (1983) On the convergence properties of the EM algorithm. Ann Stat 11(1):95–103
Xu K (2015) Stochastic block transition models for dynamic networks. Artif Intell Stat 38:1079–1087
Yang T, Chi Y, Zhu S, Gong Y, Jin R (2011) Detecting communities and their evolutions in dynamic social networks—a Bayesian approach. Mach Learn 82(2):157–189
Žiberna A (2007) Generalized blockmodeling of valued networks. Soc Netw 29(1):105–126
Acknowledgements
The authors would like to thank the editor and the anonymous referees for their valuable comments, which helped in substantially improving the quality of this work.
Appendix
1.1 Proof of Proposition 1
The evidence lower bound is defined as follows:
We study the terms on the right-hand side separately.
The three parts combined give (3).
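For orientation, the three parts can be read against the generic form that an evidence lower bound takes in stochastic block models. The block below is a sketch under stated assumptions, not the paper's equation (3): it assumes a fully factorized variational distribution \(q(Z) = \prod _{i=1}^N \prod _{k=1}^K \tau _{ik}^{Z_{ik}}\), as in Daudin et al. (2008), and the symbols \(X\) (observed data), \(Z\) (latent allocations), \(\theta \) (model parameters), \(q\) and \(\mathcal {J}\) are notation introduced here for illustration.

```latex
\[
\mathcal{J}(q, \theta)
  = \underbrace{\mathbb{E}_q\!\left[\log p(X \mid Z, \theta)\right]}_{\text{data term}}
  + \underbrace{\mathbb{E}_q\!\left[\log p(Z \mid \lambda)\right]}_{\text{allocation term}}
  - \underbrace{\mathbb{E}_q\!\left[\log q(Z)\right]}_{\text{entropy term}},
\]
\[
\mathbb{E}_q\!\left[\log p(Z \mid \lambda)\right]
  = \sum_{i=1}^{N} \sum_{k=1}^{K} \tau_{ik} \log \lambda_k ,
\qquad
- \mathbb{E}_q\!\left[\log q(Z)\right]
  = - \sum_{i=1}^{N} \sum_{k=1}^{K} \tau_{ik} \log \tau_{ik} .
\]
```

The data term depends on the likelihood of the interaction and non-interaction lengths and is therefore left in its general form here.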
1.2 Proof of Proposition 2
The evidence lower bound can be rewritten as follows:
Now consider the following Lagrangian:
with multipliers \(\xi _1,\ldots ,\xi _N\). The derivative is equal to the following:
with root:
Regarding the constraints:
This yields the following:
This critical point is a maximum. Using this result in (4) finishes the proof.
1.3 Proof of Proposition 3
Consider the following Lagrangian:
and its derivative:
This gives the root \(\lambda _k = - \sum _{i=1}^N \tau _{ik} / \xi \) and in turn:
which leads to the result of the proposition. This critical point is a maximum.
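The step from the root to the final formula can be made explicit: since each row of \(\tau \) sums to one, imposing \(\sum _k \lambda _k = 1\) on \(\lambda _k = - \sum _{i=1}^N \tau _{ik} / \xi \) gives \(\xi = -N\), so that \(\lambda _k = \sum _{i=1}^N \tau _{ik} / N\). The following Python sketch illustrates this update numerically. It assumes that the part of the lower bound depending on \(\lambda \) is \(\sum _{i}\sum _{k} \tau _{ik} \log \lambda _k\), which is consistent with the root above; the function names are hypothetical and are not taken from the authors' implementation.

```python
import numpy as np


def update_mixing_proportions(tau):
    """Closed-form update lambda_k = sum_i tau_ik / N.

    `tau` is an (N, K) array of variational probabilities whose rows
    sum to one. Hypothetical helper, written for illustration only.
    """
    tau = np.asarray(tau, dtype=float)
    if tau.ndim != 2:
        raise ValueError("tau must be a two-dimensional (N, K) array")
    if not np.allclose(tau.sum(axis=1), 1.0):
        raise ValueError("each row of tau must sum to one")
    n_nodes = tau.shape[0]
    return tau.sum(axis=0) / n_nodes


def mixing_objective(tau, lam):
    """Assumed lambda-dependent part of the bound: sum_ik tau_ik log lambda_k."""
    tau = np.asarray(tau, dtype=float)
    lam = np.asarray(lam, dtype=float)
    return float(np.sum(tau * np.log(lam)))


# Toy example with N = 3 nodes and K = 2 groups.
tau_example = np.array([[0.9, 0.1],
                        [0.2, 0.8],
                        [0.7, 0.3]])
lam_example = update_mixing_proportions(tau_example)
```

In this toy example the column sums of `tau_example` are 1.8 and 1.2, so the update returns values close to 0.6 and 0.4, and `mixing_objective` is larger at this point than at any other probability vector, for instance the uniform one.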
1.4 Proof of Proposition 4
From (3):
has root \(\mu _{gh} = \frac{ \bar{L}_{\mu _{gh}} }{ \bar{\eta }_{gh} }\) which corresponds to a maximum. The formula for \(\nu _{gh}\) is obtained analogously.
1.5 Proof of Proposition 5
The proof of this proposition closely follows the proof of Proposition 8 in Daudin et al. (2008). Our model selection criterion is the exact integrated completed log-likelihood, which is defined as:
The first term on the right-hand side can be calculated using a BIC-like approximation, as follows:
where \(2K^2\) is the number of components’ parameters and \(\sum _{i\ne j}W_{ij}\) is the number of data points. The second term on the right-hand side of (5) can be calculated using the same approximation proposed by Daudin et al. (2008):
Combining the two formulas gives the result in Proposition 5.
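As a reading aid, the two approximations described above can be written in the usual BIC form, where each term is a maximized log-likelihood minus half the number of free parameters times the logarithm of the number of data points. The block below is a sketch assembled from the quantities named in the proof (\(2K^2\) component parameters, \(\sum _{i\ne j}W_{ij}\) data points) and from the \(K-1\) free mixing proportions penalized on \(N\) nodes in Daudin et al. (2008). The symbols \(X\) and \(Z\) for the data and the allocations are assumed notation, and the display is not a verbatim copy of the statement of Proposition 5.

```latex
\[
\log p(X \mid Z) \approx \max_{\mu, \nu} \log p(X \mid Z, \mu, \nu)
  - \frac{2K^2}{2} \log \Big( \sum_{i \ne j} W_{ij} \Big),
\]
\[
\log p(Z) \approx \max_{\lambda} \log p(Z \mid \lambda)
  - \frac{K-1}{2} \log N ,
\]
\[
\log p(X, Z) \approx \max_{\mu, \nu, \lambda} \log p(X, Z \mid \mu, \nu, \lambda)
  - K^2 \log \Big( \sum_{i \ne j} W_{ij} \Big)
  - \frac{K-1}{2} \log N .
\]
```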
Cite this article
Rastelli, R., Fop, M. A stochastic block model for interaction lengths. Adv Data Anal Classif 14, 485–512 (2020). https://doi.org/10.1007/s11634-020-00403-w
DOI: https://doi.org/10.1007/s11634-020-00403-w
Keywords
- Interaction lengths
- Stochastic block model
- Variational inference
- Integrated completed likelihood
- Social network analysis