Abstract
W-graph refers to a general class of random graph models that can be seen as a random graph limit. It is characterized by both its graphon function and its motif frequencies. In this paper, relying on an existing variational Bayes algorithm for the stochastic block models (SBMs) along with the corresponding weights for model averaging, we derive an estimate of the graphon function as an average of SBMs with an increasing number of blocks. In the same framework, we derive the variational posterior frequency of any motif. A simulation study and an illustration on a social network complete our work.
References
Airoldi, E.M., Costa, T.B., Chan, S.H.: Stochastic blockmodel approximation of a graphon: theory and consistent estimation. Adv. Neural Inf. Process. Syst. 692–700 (2013)
Asta, D., Shalizi, C.R.: Geometric Network Comparison. Technical report (2014). arXiv:1411.1350v1
Barabási, A.L., Albert, R.: Emergence of scaling in random networks. Science 286, 509–512 (1999)
Barbour, A., Reinert, G.: Discrete small world networks. Electron. J. Probab. 11(47), 1234–1283 (2006)
Beal, M.J., Ghahramani, Z.: The variational Bayesian EM algorithm for incomplete data: with application to scoring graphical model structures. Bayesian Stat. 7, 543–552 (2003)
Bhattacharyya, S., Bickel, P.J.: Subsampling bootstrap of count features of networks. Ann. Stat. 43(6), 2384–2411 (2015)
Bickel, P., Chen, A.: A non parametric view of network models and Newman–Girvan and other modularities. Proc. Natl Acad. Sci. USA 106, 21068–21073 (2009)
Bickel, P., Chen, A., Levina, E.: The method of moments and degree distributions for network models. Ann. Stat. 39(5), 2280–2301 (2011)
Bollobás, B., Janson, S., Riordan, O.: The phase transition in inhomogeneous random graphs. Random Struct. Algorithms 31(1), 3–122 (2007)
Borgs, C., Chayes, J., Cohn, H., Ganguly, S.: Consistent Nonparametric Estimation for Heavy-Tailed Sparse Graphs. Technical report (2015). arXiv:1508.06675
Celisse, A., Daudin, J.-J., Pierre, L.: Consistency of maximum-likelihood and variational estimators in the stochastic block model. Electron. J. Stat. 6, 1847–1899 (2012)
Chan, S., Airoldi, E.: A consistent histogram estimator for exchangeable graph models. J. Mach. Learn. Res. Conf. Proc. 32, 208–216 (2014)
Channarond, A., Daudin, J.-J., Robin, S.: Classification and estimation in the stochastic block model based on the empirical degrees. Electron. J. Stat. 6, 2574–2601 (2012)
Chatterjee, S.: Matrix estimation by universal singular value thresholding. Ann. Stat. 43(1), 177–214 (2015)
Daudin, J.-J., Picard, F., Robin, S.: A mixture model for random graphs. Stat. Comput. 18(2), 173–183 (2008)
Diaconis, P., Janson, S.: Graph limits and exchangeable random graphs. Rend. Mat. Appl. 7(28), 33–61 (2008)
Gazal, S., Daudin, J.-J., Robin, S.: Accuracy of variational estimates for random graph mixture models. J. Stat. Comput. Simul. 82(6), 849–862 (2012)
Girvan, M., Newman, M.: Community structure in social and biological networks. Proc. Natl Acad. Sci. USA 99(12), 7821 (2002)
Gouda, A., Szántai, T.: On numerical calculation of probabilities according to Dirichlet distribution. Ann. Oper. Res. 177, 185–200 (2010). doi:10.1007/s10479-009-0601-9
Hoeting, J.A., Madigan, D., Raftery, A.E., Volinsky, C.T.: Bayesian model averaging: a tutorial. Stat. Sci. 14(4), 382–417 (1999)
Hoff, P.: Modeling homophily and stochastic equivalence in symmetric relational data. Adv. Neural Inf. Process. Syst. 20, 657–664 (2008)
Kallenberg, O.: Multivariate sampling and the estimation problem for exchangeable arrays. J. Theor. Probab. 12(3), 859–883 (1999)
Latouche, P., Birmelé, E., Ambroise, C.: Overlapping stochastic block models with application to the French political blogosphere. Ann. Appl. Stat. 5(1), 309–336 (2011)
Latouche, P., Birmelé, E., Ambroise, C.: Variational Bayesian inference and complexity control for stochastic block models. Stat. Model. 12(1), 93–115 (2012)
Lloyd, J., Orbanz, P., Ghahramani, Z., Roy, D.: Random function priors for exchangeable arrays with applications to graphs and relational data. Adv. Neural Inf. Process. Syst. 998–1006 (2012)
Lovász, L., Szegedy, B.: Limits of dense graph sequences. J. Comb. Theory B 96(6), 933–957 (2006)
Mariadassou, M., Robin, S., Vacher, C.: Uncovering latent structure in valued graphs: a variational approach. Ann. Appl. Stat. 4(2), 715–742 (2010)
Milo, R., Shen-Orr, S., Itzkovitz, S., Kashtan, N., Chklovskii, D., Alon, U.: Network motifs: simple building blocks of complex networks. Science 298, 824–827 (2002)
Nowicki, K., Snijders, T.: Estimation and prediction for stochastic block-structures. J. Am. Stat. Assoc. 96, 1077–1087 (2001)
Palla, G., Lovász, L., Vicsek, T.: Multifractal network generator. Proc. Natl Acad. Sci. USA 107(17), 7640–7645 (2010)
Picard, F., Daudin, J.-J., Koskas, M., Schbath, S., Robin, S.: Assessing the exceptionality of network motifs. J. Comput. Biol. 15(1), 1–20 (2008)
Robins, G., Pattison, P., Kalish, Y., Lusher, D.: An introduction to exponential random graph models for social networks. Soc. Netw. 29, 173–191 (2007)
Stark, D.: Compound Poisson approximations of subgraph counts in random graphs. Random Struct. Algorithms 18(1), 39–60 (2001)
Volant, S., Magniette, M.-L.M., Robin, S.: Variational Bayes approach for model aggregation in unsupervised classification with Markovian dependency. Comput. Stat. Data Anal. 56(8), 2375–2387 (2012)
Watts, D.J., Strogatz, S.H.: Collective dynamics of ‘small-world’ networks. Nature 393(6684), 440–442 (1998)
Wolfe, P.J., Olhede, S.C.: Nonparametric graphon estimation. Technical report (2013). arXiv:1309.5936
Yang, J., Han, Q., Airoldi, E.: Nonparametric estimation and testing of exchangeable graph models. J. Mach. Learn. Res. Conf. Proc. 30, 1060–1067 (2014)
Zanghi, H., Ambroise, C., Miele, V.: Fast online graph clustering via Erdös Renyi mixture. Pattern Recognit. 41(12), 3592–3599 (2008)
Acknowledgments
The authors thank Stevenn Volant for helpful comments and discussions. The authors also thank the anonymous reviewer for his helpful remarks on our work.
Appendix
1.1 Inference of the function W
Proof of Proposition 1
The first part is straightforward and follows by conditioning on the binning of \(u\) and \(v\).
We are now left with the calculation of
where
- \(\mathbf{a},\,\varvec{\eta }\) and \(\varvec{\zeta }\) are the parameters of the variational Bayes posterior distributions;
- \(b(\cdot ;\,\varvec{\eta },\,\varvec{\zeta })\) stands for the pdf of the Beta distribution \(\text{ Beta }(\varvec{\eta },\, \varvec{\zeta });\)
- \(F_{q, \ell }(u,\,v;\,\mathbf{a})\) denotes the joint cdf of \((\sigma _q,\, \sigma _\ell ),\) as defined in (1), when \({\varvec{\alpha }}\) has a Dirichlet distribution \(\text{ Dir }(\mathbf{a}).\)
The last argument comes from Gouda and Szántai (2010) who give explicit recursions to compute the uni- and bi-variate cdf for the Dirichlet \(\text{ Dir }(\mathbf{a}),\) denoted \( G_{q}(u;\,\mathbf{a})\) and \(G_{q, \ell }(u,\, v;\, \mathbf{a}),\) respectively.
Recalling that the approximate variational posterior of \({\varvec{\alpha }}\) is \(\text{ Dir }(\mathbf{a})\) and using a simple property of the Dirichlet distribution
the calculation of \(F_{q, \ell }(u,\,v)\) follows as
where the \((s_q)\) are the cumulated parameters: \(s_q = \sum _{j=1}^q a_j.\) \(\square \)
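The binning argument can be checked numerically. The sketch below is not the closed-form recursion of Gouda and Szántai (2010). It is a Monte Carlo approximation written under two assumptions: that \(W(u,\,v)\) equals the connectivity of the blocks whose bins \((\sigma _{q-1},\,\sigma _q]\) contain \(u\) and \(v,\) and that \({\varvec{\alpha }}\) and the connectivities are independent under the variational posterior, with each connectivity following \(\text{ Beta }(\eta _{q\ell },\,\zeta _{q\ell }).\) The function name `vb_graphon_mean_mc` and its arguments are hypothetical.

```python
import numpy as np


def vb_graphon_mean_mc(u, v, a, eta, zeta, n_draws=20000, seed=0):
    """Monte Carlo approximation of the posterior mean of W(u, v).

    Assumptions (not taken from the paper's closed form):
    alpha ~ Dir(a), pi[q, l] ~ Beta(eta[q, l], zeta[q, l]), independent.
    """
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    eta = np.asarray(eta, dtype=float)
    zeta = np.asarray(zeta, dtype=float)
    n_blocks = a.size

    # Draw block proportions and their cumulated sums sigma_1, ..., sigma_Q.
    alpha = rng.dirichlet(a, size=n_draws)
    sigma = np.cumsum(alpha, axis=1)

    # Bin of u (resp. v) in each draw: number of sigma_q lying at or below it,
    # clipped so that u = 1 still falls in the last bin.
    bin_u = np.minimum((sigma <= u).sum(axis=1), n_blocks - 1)
    bin_v = np.minimum((sigma <= v).sum(axis=1), n_blocks - 1)

    # Mean of Beta(eta, zeta); averaging it over the drawn bins integrates
    # out the connectivities exactly and the binning by simulation.
    pi_mean = eta / (eta + zeta)
    return pi_mean[bin_u, bin_v].mean()
```

With \(Q = 2\) and \(\mathbf{a} = (1,\,1),\) \(\sigma _1\) is uniform on \([0,\,1],\) so for \(u = v = 0.25\) both points fall in the first bin with probability 0.75, which gives a simple hand check of the output.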
1.2 Motif probability
Proof of Proposition 3
We directly write the approximate variational expectation
where
Furthermore, we have
so we end up with
and the proof is completed (Fig. 8). \(\square \)
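To make the structure of this expectation concrete, the following sketch enumerates all block assignments of the motif nodes and combines standard Dirichlet and Beta moments. It rests on assumptions that are not restated in this appendix: the motif is described by a symmetric 0/1 adjacency matrix `m`, its probability under an SBM is the sum over node labellings of the product of block proportions and of the connectivities of the edges present in the motif, and the variational posterior factorises into \(\text{ Dir }(\mathbf{a})\) and independent \(\text{ Beta }(\eta _{q\ell },\,\zeta _{q\ell })\) with symmetric connectivities. All function names are hypothetical, and the enumeration costs \(Q^k\) terms for a motif with \(k\) nodes.

```python
from itertools import product
from math import exp, lgamma

import numpy as np


def dirichlet_moment(a, counts):
    """E[prod_q alpha_q ** counts[q]] for alpha ~ Dir(a)."""
    a = np.asarray(a, dtype=float)
    counts = np.asarray(counts, dtype=float)
    log_m = lgamma(a.sum()) - lgamma(a.sum() + counts.sum())
    log_m += sum(lgamma(x + n) - lgamma(x) for x, n in zip(a, counts))
    return exp(log_m)


def beta_moment(eta, zeta, k):
    """E[pi ** k] for pi ~ Beta(eta, zeta) and a non-negative integer k."""
    out = 1.0
    for j in range(k):
        out *= (eta + j) / (eta + zeta + j)
    return out


def vb_motif_mean(m, a, eta, zeta):
    """Posterior mean of the motif probability under the assumed factorisation."""
    m = np.asarray(m)
    k = m.shape[0]
    n_blocks = len(a)
    edges = [(i, j) for i in range(k) for j in range(i + 1, k) if m[i, j]]
    total = 0.0
    for labels in product(range(n_blocks), repeat=k):
        # Number of motif nodes assigned to each block.
        node_counts = np.bincount(labels, minlength=n_blocks)
        # Number of motif edges between each unordered pair of blocks.
        edge_counts = {}
        for i, j in edges:
            key = (min(labels[i], labels[j]), max(labels[i], labels[j]))
            edge_counts[key] = edge_counts.get(key, 0) + 1
        term = dirichlet_moment(a, node_counts)
        for (q, l), n in edge_counts.items():
            term *= beta_moment(eta[q][l], zeta[q][l], n)
        total += term
    return total
```

For a single block with a uniform \(\text{ Beta }(1,\,1)\) posterior on the connectivity, the triangle has expected probability \(\mathbb {E}[\pi ^3] = 1/4,\) which the sketch reproduces.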
Proof of Proposition 2
Because the \(Z_i\)’s are uniformly distributed over \([0;\,1],\) we have
\(\square \)
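The role of the uniform \(Z_i\)'s can be illustrated by simulation. The sketch below assumes that the probability of a motif under a W-graph is the integral over \([0,\,1]^k\) of the product of \(W(z_i,\,z_j)\) over the edges present in the motif; this definition is an assumption here, since the corresponding display is not reproduced in this appendix. The function name `motif_prob_mc` is hypothetical, and `W` is any vectorised Python function of two arrays.

```python
import numpy as np


def motif_prob_mc(W, m, n_draws=50000, seed=0):
    """Monte Carlo approximation of the motif probability under graphon W.

    m is a symmetric 0/1 adjacency matrix; only its upper triangle is read.
    """
    rng = np.random.default_rng(seed)
    m = np.asarray(m)
    k = m.shape[0]

    # One row per draw: the latent positions Z_1, ..., Z_k, uniform on [0, 1].
    z = rng.uniform(size=(n_draws, k))

    # Product of W(Z_i, Z_j) over the edges present in the motif.
    prob = np.ones(n_draws)
    for i in range(k):
        for j in range(i + 1, k):
            if m[i, j]:
                prob *= W(z[:, i], z[:, j])
    return prob.mean()
```

For a constant graphon \(W \equiv p\) the triangle probability is \(p^3,\) and for \(W(u,\,v) = uv\) a single edge has probability \(1/4;\) both cases give quick sanity checks.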
Cite this article
Latouche, P., Robin, S. Variational Bayes model averaging for graphon functions and motif frequencies inference in W-graph models. Stat Comput 26, 1173–1185 (2016). https://doi.org/10.1007/s11222-015-9607-0
DOI: https://doi.org/10.1007/s11222-015-9607-0