The Hierarchy of Block Models

Abstract

A variety of network block models exist, such as the Stochastic Block Model (SBM), the Degree Corrected Block Model (DCBM), and the Popularity Adjusted Block Model (PABM). While this offers a range of choices, the block models do not have a nested structure. In addition, there is a substantial jump in the number of parameters from the DCBM to the PABM. The objective of this paper is the formulation of a hierarchy of block models that does not rely on arbitrary identifiability conditions. We propose a Nested Block Model (NBM) that treats the SBM, the DCBM and the PABM as its particular cases with specific parameter values and, in addition, allows a multitude of versions that are more complicated than the DCBM but have fewer unknown parameters than the PABM. The latter allows one to carry out clustering and estimation without preliminary testing to determine which block model is actually true.
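
To make the nesting idea concrete, the following minimal sketch builds the connection probability matrices of the three models as they are commonly defined in the literature and checks numerically that an SBM is a DCBM with unit degree weights, and that a DCBM can be written as a PABM. It is an illustration only, not the authors' NBM or their code: the function names, the toy community labels `c`, the symmetric block matrix `B`, the weights `theta`, and the construction `V[i][k] = theta[i] * sqrt(B[c[i]][k])` are assumptions introduced here, and the diagonal entries are filled in for simplicity.

```python
import math

def sbm_probs(c, B):
    """SBM: P[i][j] = B[c[i]][c[j]] (depends only on the two communities)."""
    n = len(c)
    return [[B[c[i]][c[j]] for j in range(n)] for i in range(n)]

def dcbm_probs(c, B, theta):
    """DCBM: P[i][j] = theta[i] * theta[j] * B[c[i]][c[j]]."""
    n = len(c)
    return [[theta[i] * theta[j] * B[c[i]][c[j]] for j in range(n)]
            for i in range(n)]

def pabm_probs(c, V):
    """PABM: P[i][j] = V[i][c[j]] * V[j][c[i]], with V of size n x K."""
    n = len(c)
    return [[V[i][c[j]] * V[j][c[i]] for j in range(n)] for i in range(n)]

def dcbm_as_pabm(c, B, theta):
    """Popularity matrix V reproducing a DCBM; assumes B is symmetric."""
    K = len(B)
    return [[theta[i] * math.sqrt(B[c[i]][k]) for k in range(K)]
            for i in range(len(c))]

def max_abs_diff(P, Q):
    """Largest entrywise absolute difference between two square matrices."""
    return max(abs(p - q) for rp, rq in zip(P, Q) for p, q in zip(rp, rq))

# Toy example (hypothetical values): 5 nodes, 2 communities.
c = [0, 0, 1, 1, 1]
B = [[0.8, 0.2], [0.2, 0.6]]
theta = [1.0, 0.5, 0.9, 0.7, 0.4]

# SBM is the DCBM with all degree weights equal to one.
gap_sbm_dcbm = max_abs_diff(sbm_probs(c, B),
                            dcbm_probs(c, B, [1.0] * len(c)))

# DCBM is the PABM with V[i][k] = theta[i] * sqrt(B[c[i]][k]).
gap_dcbm_pabm = max_abs_diff(dcbm_probs(c, B, theta),
                             pabm_probs(c, dcbm_as_pabm(c, B, theta)))

print(gap_sbm_dcbm, gap_dcbm_pabm)
```

In this sketch the DCBM is described by roughly n + K(K+1)/2 numbers (`theta` and `B`), while a general PABM uses the full n-by-K matrix `V`, which illustrates the jump in the number of parameters that the abstract refers to.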

Code availability

Code will be available online from the website of the second author.

Availability of data and material

No new data are associated with this paper.

Funding

Both authors of the paper were partially supported by the National Science Foundation (NSF) grants DMS-1712977 and DMS-2014928.

Author information

Corresponding author

Correspondence to Majid Noroozi.

Ethics declarations

Conflicts of interest/Competing interests

None

About this article

Cite this article

Noroozi, M., Pensky, M. The Hierarchy of Block Models. Sankhya A (2021). https://doi.org/10.1007/s13171-021-00247-2

Keywords and phrases

  • Stochastic block model
  • Degree corrected block model
  • Popularity adjusted block model
  • Sparse subspace clustering
  • Spectral clustering with k-median

AMS (2000) subject classification

  • 05C80
  • 62F12
  • 62H30