We study a random graph model called the “stochastic block model” in statistics and the “planted partition model” in theoretical computer science. In its simplest form, this is a random graph with two equal-sized classes of vertices, with a within-class edge probability of q and a between-class edge probability of q′.
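As an illustration of the model just described, the following is a minimal sketch of a sampler, not code from the paper. The function name `sample_planted_partition` and the parameters `q_in` and `q_out` (standing for q and q′) are hypothetical names introduced here, and the class labels are assumed to be assigned uniformly at random subject to the two classes having equal size.

```python
import random
from itertools import combinations


def sample_planted_partition(n, q_in, q_out, seed=None):
    """Sample (labels, edges) from a two-class planted partition model.

    n      -- number of vertices (assumed even so the classes are equal-sized)
    q_in   -- within-class edge probability (q in the text)
    q_out  -- between-class edge probability (q' in the text)
    """
    if n % 2 != 0:
        raise ValueError("n must be even so the two classes have equal size")
    rng = random.Random(seed)

    # Half of the vertices get label +1 and half get label -1, in random order.
    labels = [1] * (n // 2) + [-1] * (n // 2)
    rng.shuffle(labels)

    # Each unordered pair becomes an edge independently, with a probability
    # that depends only on whether the two endpoints share a label.
    edges = []
    for u, v in combinations(range(n), 2):
        p = q_in if labels[u] == labels[v] else q_out
        if rng.random() < p:
            edges.append((u, v))
    return labels, edges


# Usage: a sparse instance with q = a/n and q' = b/n (a = 5, b = 1 chosen arbitrarily).
n = 200
labels, edges = sample_planted_partition(n, 5 / n, 1 / n, seed=0)
```

The loop visits every pair of vertices, so this sketch takes time quadratic in n; it is meant to make the definition concrete rather than to be efficient for large sparse graphs.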
A striking conjecture of Decelle, Krzakala, Moore and Zdeborová, based on deep, non-rigorous ideas from statistical physics, gave a precise prediction for the algorithmic threshold of clustering in the sparse planted partition model. In particular, if q = a/n and q′ = b/n, s = (a − b)/2 and d = (a + b)/2, then Decelle et al. conjectured that it is possible to efficiently cluster in a way correlated with the true partition if s^2 > d and impossible if s^2 < d. By comparison, until recently the best-known rigorous result showed that clustering is possible if s^2 > C d ln d for sufficiently large C.
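The conjectured threshold is simple arithmetic on a and b, which a short sketch can make concrete. The helper names `threshold_parameters` and `conjectured_regime` are hypothetical and introduced only for illustration; the code merely evaluates the condition stated above and does not perform any clustering.

```python
def threshold_parameters(a, b):
    """Return (s, d) with s = (a - b) / 2 and d = (a + b) / 2."""
    s = (a - b) / 2
    d = (a + b) / 2
    return s, d


def conjectured_regime(a, b):
    """Classify (a, b) by comparing s^2 with d.

    Returns "possible" if s^2 > d, "impossible" if s^2 < d, and
    "boundary" if s^2 == d (a case the conjecture as stated above
    does not address).
    """
    s, d = threshold_parameters(a, b)
    if s * s > d:
        return "possible"
    if s * s < d:
        return "impossible"
    return "boundary"


# Worked examples (values of a and b chosen arbitrarily):
# a = 5, b = 1 gives s = 2, d = 3, so s^2 = 4 > 3.
# a = 3, b = 1 gives s = 1, d = 2, so s^2 = 1 < 2.
print(conjectured_regime(5, 1))
print(conjectured_regime(3, 1))
```

Note that the comparison depends on a and b only, not on n: in the sparse regime the average degree d stays fixed as the graph grows.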
In a previous work, we proved that it is indeed information-theoretically impossible to cluster if s^2 ≤ d and, moreover, that it is information-theoretically impossible even to estimate the model parameters from the graph when s^2 < d. Here we prove the rest of the conjecture by providing an efficient algorithm for clustering in a way that is correlated with the true partition when s^2 > d. A different, independent proof of the same result was recently obtained by Massoulié.
K. B. Athreya and P. E. Ney: Branching processes, Springer-Verlag, New York, 1972. Die Grundlehren der mathematischen Wissenschaften, Band 196.
P. J. Bickel and A. Chen: A nonparametric view of network models and Newman-Girvan and other modularities, Proceedings of the National Academy of Sciences 106 (2009), 21068–21073.
R. B. Boppana: Eigenvalues and graph bisection: An average-case analysis, in: 28th Annual Symposium on Foundations of Computer Science, 280–285. IEEE, 1987.
C. Bordenave: A new proof of Friedman’s second eigenvalue theorem and its extension to random lifts. arXiv preprint arXiv:1502.04482, 2015.
C. Bordenave, M. Lelarge and L. Massoulié: Non-backtracking spectrum of random graphs: community detection and non-regular Ramanujan graphs. arXiv preprint arXiv:1501.06087, 2015.
T. N. Bui, S. Chaudhuri, F. T. Leighton and M. Sipser: Graph bisection algorithms with good average case behavior, Combinatorica 7 (1987), 171–191.
A. Coja-Oghlan: Graph partitioning via adaptive spectral techniques, Combinatorics, Probability and Computing 19 (2010), 227–284.
A. Condon and R. M. Karp: Algorithms for graph partitioning on the planted partition model, Random Structures and Algorithms 18 (2001), 116–140.
A. Decelle, F. Krzakala, C. Moore and L. Zdeborová: Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications, Physical Review E 84 (2011), 066106.
M. E. Dyer and A. M. Frieze: The solution of some random NP-hard problems in polynomial expected time, Journal of Algorithms 10 (1989), 451–489.
L. Erdős, A. Knowles, H.-T. Yau and J. Yin: Spectral statistics of Erdős-Rényi graphs II: eigenvalue spacing and the extreme eigenvalues, Communications in Mathematical Physics 314 (2012), 587–640.
U. Feige and E. Ofek: Spectral techniques applied to sparse random graphs, Random Structures & Algorithms 27 (2005), 251–275.
A. Flaxman, A. Frieze and T. Fenner: High degree vertices and eigenvalues in the preferential attachment graph, Internet Math. 2 (2005), 1–19.
O. Guédon and R. Vershynin: Community detection in sparse networks via Grothendieck’s inequality. arXiv preprint arXiv:1411.4686, 2014.
P. W. Holland, K. B. Laskey and S. Leinhardt: Stochastic blockmodels: First steps, Social Networks 5 (1983), 109–137.
M. Jerrum and G. B. Sorkin: The Metropolis algorithm for graph bisection, Discrete Applied Mathematics 82 (1998), 155–175.
H. Kesten and B. P. Stigum: Additional limit theorems for indecomposable multidimensional Galton-Watson processes, Ann. Math. Statist. 37 (1966), 1463–1481.
F. Krzakala, C. Moore, E. Mossel, J. Neeman, A. Sly, L. Zdeborová and P. Zhang: Spectral redemption: clustering sparse networks. arXiv preprint arXiv:1306.5550, 2013.
J. Leskovec, K. J. Lang, A. Dasgupta and M. W. Mahoney: Statistical properties of community structure in large social and information networks, in: Proceedings of the 17th International Conference on World Wide Web, 695–704. ACM, 2008.
L. Massoulié: Community detection thresholds and the weak Ramanujan property, in: Proceedings of the 46th Annual ACM Symposium on Theory of Computing, 694–703. ACM, 2014.
F. McSherry: Spectral partitioning of random graphs, in: 42nd IEEE Symposium on Foundations of Computer Science, 529–537. IEEE, 2001.
E. Mossel, J. Neeman and A. Sly: Stochastic block models and reconstruction, Probability Theory and Related Fields, 2014, (to appear).
R. R. Nadakuditi and M. E. J. Newman: Graph spectra and the detectability of community structure in networks, Physical Review Letters 108 (2012), 188701.
K. Rohe, S. Chatterjee and B. Yu: Spectral clustering and the high-dimensional stochastic blockmodel, The Annals of Statistics 39 (2011), 1878–1915.
B. Roos: Binomial approximation to the Poisson binomial distribution: The Krawtchouk expansion, Theory of Probability and Its Applications 45 (2001), 258–272.
T. A. B. Snijders and K. Nowicki: Estimation and prediction for stochastic block-models for graphs with latent block structure, Journal of Classification 14 (1997), 75–100.
S. H. Strogatz: Exploring complex networks, Nature 410 (2001), 268–276.
T. Tao and V. Vu: Random matrices: the circular law, Communications in Contemporary Mathematics 10 (2008), 261–307.
P. M. Wood: Universality and the circular law for sparse random matrices, The Annals of Applied Probability 22 (2012), 1266–1300.
S.-Y. Yun and A. Proutiere: Community detection via random and adaptive sampling, arXiv preprint arXiv:1402.3072, 2014.
Supported by NSF grant DMS-1106999, NSF grant CCF 1320105 and DOD ONR grant N000141110140.
Supported by NSF grant DMS-1106999 and DOD ONR grant N000141110140.
Supported by an Alfred Sloan Fellowship and NSF grant DMS-1208338.
Mossel, E., Neeman, J. & Sly, A. A Proof of the Block Model Threshold Conjecture. Combinatorica 38, 665–708 (2018). https://doi.org/10.1007/s00493-016-3238-8