An efficient multilevel scheme for coarsening large scale social networks

Applied Intelligence

Abstract

The explosive growth of data generated by social networks hinders researchers from analysing them effectively. Is it possible, then, to rapidly “zoom out” from such a huge network while preserving its overall structure? This technique is known as “graph reduction” and is a significant task in social network analysis. Several methods have been developed to extract a smaller, succinct version of the graph. Some belong to the category of “graph sampling” and risk losing key characteristics of communities. Others follow a “coarsening strategy” and are designed to cope with the problem of community discovery, which is our purpose here. In this paper, we propose a multi-level coarsening algorithm called MCCA (Multi-level Coarsening Compact Areas). The main strategy of this algorithm is to merge well-connected zones at every level, updating edge and vertex weights, until a stopping criterion is met. Using real-world social networks, we evaluate the quality and scalability of MCCA. Furthermore, we compare it with eight known proposals. We also show how our method can be used as a preliminary step for community detection without much loss of information.
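
As a rough illustration of the multilevel idea described above (merge at every level, update vertex and edge weights, stop on a criterion), the following Python sketch may help. It is not MCCA: the abstract does not say how well-connected zones are identified, so heavy-edge matching, the classic pairing rule from multilevel graph partitioning, is swapped in as a stand-in. The function names, the `target_size` stopping criterion, and the toy graph are all assumptions made for this example.

```python
from collections import defaultdict


def build_graph(edges):
    """Build a symmetric weighted adjacency dict from (u, v, weight) triples."""
    adj = defaultdict(dict)
    for u, v, w in edges:
        adj[u][v] = w
        adj[v][u] = w
    return dict(adj)


def coarsen_once(adj, vweight):
    """One coarsening level using heavy-edge matching (a stand-in, not MCCA).

    adj: node -> {neighbour: edge weight}, undirected and symmetric.
    vweight: node -> vertex weight (number of original nodes it represents).
    Returns (coarse_adj, coarse_vweight, mapping from fine to coarse node).
    """
    mapping = {}
    next_id = 0
    for u in sorted(adj):  # fixed visiting order keeps the result deterministic
        if u in mapping:
            continue
        mapping[u] = next_id
        # Candidate partners: neighbours not yet merged into a coarse node.
        candidates = [(w, v) for v, w in adj[u].items() if v not in mapping]
        if candidates:
            _, v = max(candidates)  # heaviest edge wins; ties go to larger id
            mapping[v] = next_id
        next_id += 1

    # Vertex weight of a coarse node is the sum of the weights it absorbed.
    coarse_vweight = defaultdict(int)
    for u, c in mapping.items():
        coarse_vweight[c] += vweight[u]

    # Parallel edges between two coarse nodes are summed; internal edges vanish.
    coarse_adj = {c: defaultdict(int) for c in range(next_id)}
    for u, nbrs in adj.items():
        for v, w in nbrs.items():
            cu, cv = mapping[u], mapping[v]
            if cu != cv:
                coarse_adj[cu][cv] += w
    return {c: dict(n) for c, n in coarse_adj.items()}, dict(coarse_vweight), mapping


def multilevel_coarsen(adj, target_size):
    """Repeat coarsening until at most target_size nodes remain (assumed criterion)."""
    vweight = {u: 1 for u in adj}
    levels = []  # one fine-to-coarse mapping per level, useful for projecting back
    while len(adj) > target_size:
        coarse_adj, coarse_vweight, mapping = coarsen_once(adj, vweight)
        if len(coarse_adj) == len(adj):  # nothing could be merged; stop early
            break
        levels.append(mapping)
        adj, vweight = coarse_adj, coarse_vweight
    return adj, vweight, levels


# Toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1, 1), (0, 2, 1), (1, 2, 1),
         (3, 4, 1), (3, 5, 1), (4, 5, 1),
         (2, 3, 1)]
coarse, weights, levels = multilevel_coarsen(build_graph(edges), target_size=2)
print(coarse)   # {0: {1: 1}, 1: {0: 1}}
print(weights)  # {0: 3, 1: 3}
```

On this toy input each triangle collapses into one coarse node of weight 3, and the only surviving edge is the bridge. The mappings kept in `levels` could be used to project a community assignment found on the coarse graph back onto the original nodes.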

Notes

  1. http://www.statista.com/statistics/272014/global-social-networks-ranked-by-number-of-users/

  2. https://gephi.org/users/download/

  3. http://glaros.dtc.umn.edu/gkhome/metis/metis/download

  4. https://github.com/dlasalle/nerstrand

  5. http://www.cs.vt.edu/badityap/CODE/coarsenet.tgz

  6. http://algo2.iti.kit.edu/kahip/

  7. https://github.com/chronotable/coarsening-infnet

  8. http://www.eng.usf.edu/chakeri/SpectralSparsify.m

  9. http://www.cs.cmu.edu/jkoutis/SpectralAlgorithms.htm

  10. Code at: http://www.ludowaltman.nl/slm/

Acknowledgements

We would like to thank Dominique LaSalle, Naoto Ohsaka, Roland Glantz and Alireza Chakeri for fruitful discussions about their proposed models and for providing the source code used in our simulations (Nerstrand, MaxInf, TREE and SPER, respectively). We also thank the anonymous reviewers for their valuable remarks, which led to a substantial improvement in the quality of the paper.

Author information

Corresponding author

Correspondence to Delel Rhouma.

About this article

Cite this article

Rhouma, D., Ben Romdhane, L. An efficient multilevel scheme for coarsening large scale social networks. Appl Intell 48, 3557–3576 (2018). https://doi.org/10.1007/s10489-018-1162-x

  • DOI: https://doi.org/10.1007/s10489-018-1162-x
