A LexDFS-Based Approach on Finding Compact Communities

  • Jean Creusefond
  • Thomas Largillier
  • Sylvain Peyronnet
Chapter

Abstract

This article presents an efficient hierarchical clustering algorithm based on a graph traversal algorithm called LexDFS. This traversal visits the clustered parts of the graph within a small number of iterations, which makes them recognisable. The time complexity of our method is O(n × log(n)). The method is simple to implement, and a thorough study shows that it outputs clusterings that are closer to some ground truths than those of its competitors. Experiments are also carried out to analyse the behaviour of the algorithm during execution on sample graphs. This article also introduces a quality function called compactness, which measures how efficient a cluster is for internal communication. We prove that this quality function has interesting theoretical properties.
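
For readers unfamiliar with the traversal, the following is a minimal sketch of LexDFS as it is usually defined in the graph-searching literature (for instance by Corneil and Krueger): each unvisited vertex keeps a label listing the visit steps of its already-visited neighbours, most recent first, and the next vertex visited is the one with the lexicographically largest label. The function name `lexdfs`, the adjacency-dictionary input, the sentinel label for the start vertex, and the smallest-vertex tie-breaking rule are assumptions made for illustration. This naive version is quadratic and is not the chapter's clustering algorithm, nor an implementation that reaches the stated complexity.

```python
def lexdfs(adj, start):
    """Return a LexDFS visit order of an undirected graph.

    adj   -- dict mapping each vertex to an iterable of its neighbours
             (vertices are assumed to be sortable, e.g. integers)
    start -- vertex visited first
    """
    # Label of a vertex: visit steps of its visited neighbours, newest first.
    labels = {v: [] for v in adj}
    labels[start] = [0]  # sentinel (assumption) so that `start` is picked first
    unvisited = set(adj)
    order = []
    for step in range(1, len(adj) + 1):
        # Pick the lexicographically largest label; iterating over the sorted
        # vertices makes ties resolve to the smallest vertex (an assumption).
        current = max(sorted(unvisited), key=lambda v: labels[v])
        unvisited.remove(current)
        order.append(current)
        # Prepend the current step to the label of every unvisited neighbour.
        for w in adj[current]:
            if w in unvisited:
                labels[w].insert(0, step)
    return order


if __name__ == "__main__":
    # Hypothetical example: vertex 3 closes a triangle with 0 and 1,
    # vertex 2 hangs off vertex 1.
    graph = {0: [1, 3], 1: [0, 2, 3], 2: [1], 3: [0, 1]}
    print(lexdfs(graph, 0))  # [0, 1, 3, 2]
```

In this example, after visiting 0 and then 1, vertex 3 is adjacent to both visited vertices while vertex 2 is adjacent only to the latest one, so 3 is visited before 2. This preference for vertices tied to many recently visited ones is what keeps the traversal inside densely connected regions.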

Keywords

Community detection, Compactness, LexDFS

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Jean Creusefond
    • 1
  • Thomas Largillier
    • 1
  • Sylvain Peyronnet
    • 2
  1. Normandy University, Caen, France
  2. ix-labs, Paris, France
