Estimating the Similarity of Community Detection Methods Based on Cluster Size Distribution

  • Vinh-Loc Dao
  • Cécile Bothorel
  • Philippe Lenca
Conference paper
Part of the Studies in Computational Intelligence book series (SCI, volume 812)


Detecting community structure discloses tremendous information about complex networks and unlocks promising applied perspectives. Accordingly, a large number of community detection methods have been proposed in the last two decades, with many rewarding discoveries. Nevertheless, it remains very challenging to determine a suitable method for gaining insight into the mesoscopic structure of a network given an expected quality, especially on large-scale networks. Many recent efforts have also been devoted to investigating various qualities of community structure associated with detection methods, but the answer to this question is still far from straightforward. In this paper, we propose a novel approach to estimate the similarity between community detection methods using the size density distributions of the communities that they detect. We verify our solution on a very large corpus of more than a hundred networks from five different categories and deliver pairwise similarities of 16 state-of-the-art and well-known methods. Interestingly, our results show that there is a very clear distinction between the partitioning strategies of different community detection methods. This distinction plays an important role in assisting network analysts to identify their rule-of-thumb solutions.
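
As a rough illustration of comparing two detection methods through the size distributions of their communities, the following minimal sketch may help. It is not the authors' implementation: the function names, the fixed Gaussian bandwidth on log10 sizes, and the overlap score used as the similarity value are all assumptions made for this example.

```python
from collections import Counter

import numpy as np


def community_sizes(membership):
    """Sizes of the communities in a partition given as one label per node."""
    return np.array(list(Counter(membership).values()), dtype=float)


def size_density(sizes, grid, bandwidth=0.2):
    """Gaussian kernel density of log10(size) on `grid`, normalised to sum to 1.

    The fixed bandwidth is an assumption of this sketch, not a value from the paper.
    """
    log_sizes = np.log10(sizes)
    z = (grid[:, None] - log_sizes[None, :]) / bandwidth
    density = np.exp(-0.5 * z ** 2).sum(axis=1)
    return density / density.sum()


def size_similarity(membership_a, membership_b, bandwidth=0.2, points=256):
    """Overlap of the two size densities: close to 1 if identical, close to 0 if disjoint."""
    sizes_a = community_sizes(membership_a)
    sizes_b = community_sizes(membership_b)
    # Common evaluation grid covering both distributions with a small margin.
    upper = np.log10(max(sizes_a.max(), sizes_b.max())) + 3 * bandwidth
    grid = np.linspace(-3 * bandwidth, upper, points)
    p = size_density(sizes_a, grid, bandwidth)
    q = size_density(sizes_b, grid, bandwidth)
    return float(np.minimum(p, q).sum())


if __name__ == "__main__":
    method_a = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]  # three communities of size 4
    method_b = [0] * 11 + [1]                        # one large and one singleton community
    print(size_similarity(method_a, method_b))
```

Note that a score of this kind looks only at how large the detected communities are, not at which nodes they contain, so two methods can score as similar while assigning nodes differently.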


Keywords: Community detection, Similarity metric, Community size, Comparative analysis



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Vinh-Loc Dao (1)
  • Cécile Bothorel (1)
  • Philippe Lenca (1)
  1. IMT Atlantique - Lab-STICC CNRS, Brest, France
