Learning with multi-resolution overlapping communities

Abstract

The recent surge of the participatory web and social media has created a new laboratory for studying human relations and collective behavior on an unprecedented scale. In this work, we study the predictive power of social connections for determining the preferences or behaviors of individuals, such as whether a user supports a certain political view, likes a product, or would vote for a particular presidential candidate. Because an actor is likely to participate in multiple communities, each regulating the actor’s behavior to a varying degree, and because a natural hierarchy might exist among these communities, we propose to zoom into a network at multiple resolutions and determine which communities reflect a targeted behavior. We develop an efficient algorithm to extract a hierarchy of overlapping communities. Empirical results on social media networks demonstrate the potential of the proposed approach for real-world applications.
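The idea of a hierarchy of overlapping communities viewed at several resolutions can be illustrated with a small sketch. This is not the algorithm developed in the article, which the abstract does not specify. It is a toy stand-in that assumes one starting community per edge and greedy merging by Jaccard similarity of node sets. The function names `multi_resolution_communities` and `membership_features` and the example graph are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two node sets."""
    return len(a & b) / len(a | b)


def multi_resolution_communities(edges):
    """Return a list of levels, finest first; each level is a list of node sets.

    Assumption (not from the article): start with one community per edge and
    repeatedly merge the two most similar communities. A node that lies on
    several edges belongs to several communities, so communities overlap.
    """
    clusters = [frozenset(e) for e in edges]
    levels = [list(clusters)]
    while len(clusters) > 1:
        # Choose the pair of communities with the highest Jaccard similarity.
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: jaccard(clusters[p[0]], clusters[p[1]]),
        )
        if not clusters[i] & clusters[j]:
            break  # no two communities share a node, nothing left to merge
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
        levels.append(list(clusters))
    return levels


def membership_features(levels, nodes):
    """Binary node-by-community indicators over all resolutions."""
    communities = sorted(
        {c for level in levels for c in level},
        key=lambda c: (len(c), sorted(c)),
    )
    features = {v: [int(v in c) for c in communities] for v in nodes}
    return features, communities


# Hypothetical toy graph: two triangles that share node "c".
EDGES = [("a", "b"), ("b", "c"), ("a", "c"),
         ("c", "d"), ("d", "e"), ("c", "e")]

if __name__ == "__main__":
    for depth, level in enumerate(multi_resolution_communities(EDGES)):
        print(depth, [sorted(c) for c in level])
```

Each node's indicator vector spans communities from every level of the hierarchy, so a standard supervised classifier trained on these vectors could, in principle, weight the resolutions that best reflect a targeted behavior.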

Notes

  1. www.blogcatalog.com.
  2. www.flickr.com.
  3. www.youtube.com.
  4. http://leitang.net/social_dimension.html.

References

  1. Ahn Y-Y, Bagrow JP, Lehmann S (2010) Link communities reveal multiscale complexity in networks. Nature 466:761–764
  2. Blondel V, Guillaume J, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. J Stat Mech Theory Exp 2008:P10008
  3. Chakrabarti D, Faloutsos C (2006) Graph mining: laws, generators, and algorithms. ACM Comput Surv 38(1):2
  4. Chakrabarti S, Dom B, Indyk P (1998) Enhanced hypertext categorization using hyperlinks. In: Proceedings of the 1998 ACM SIGMOD international conference on management of data. ACM, New York, pp 307–318
  5. Clauset A, Newman MEJ, Moore C (2004) Finding community structure in very large networks. Phys Rev E 70:066111
  6. Evans TS, Lambiotte R (2009) Line graphs, link partitions, and overlapping communities. Phys Rev E 80(1):16105
  7. Fan R-E, Chang K-W, Hsieh C-J, Wang X-R, Lin C-J (2008) Liblinear: a library for large linear classification. J Mach Learn Res 9:1871–1874
  8. Gallagher B, Tong H, Eliassi-Rad T, Faloutsos C (2008) Using ghost edges for classification in sparsely labeled networks. In: Proceeding of the 14th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, New York, pp 256–264
  9. Geman S, Geman D (1984) Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans Pattern Anal Mach Intell 6:452–472
  10. Getoor L, Taskar B (eds) (2007) Introduction to statistical relational learning. The MIT Press, Cambridge
  11. Gregory S (2007) An algorithm to find overlapping community structure in networks. In: Proceedings of the 11th European conference on principles and practice of knowledge discovery in databases, pp 91–102
  12. Hechter M (1988) Principles of group solidarity. University of California Press, London
  13. Hopcroft J, Khan O, Kulis B, Selman B (2003) Natural communities in large linked networks. In: Proceedings of the ninth ACM SIGKDD international conference on knowledge discovery and data mining. ACM, New York, pp 541–546
  14. Lancichinetti A, Fortunato S, Kertész J (2009) Detecting the overlapping and hierarchical community structure in complex networks. New J Phys 11:033015
  15. Leskovec J, Lang KJ, Dasgupta A, Mahoney MW (2008) Statistical properties of community structure in large social and information networks. In: Proceeding of the 17th international conference on World Wide Web. ACM, New York, pp 695–704
  16. Lin Z, Lyu MR, King I (2012) Matchsim: a novel similarity measure based on maximum neighborhood matching. Knowl Inf Syst 32(1):141–166
  17. Liu K, Tang L (2011) Large scale behavioral targeting with a social twist. In: Proceeding of the 20th ACM conference on information and knowledge management, pp 1815–1824
  18. Lu Q, Getoor L (2003) Link-based classification. In: Proceedings of the twentieth international conference on machine learning
  19. Macskassy SA, Provost F (2003) A simple relational classifier. In: Proceedings of the multi-relational data mining workshop (MRDM) at the ninth ACM SIGKDD international conference on knowledge discovery and data mining
  20. Macskassy SA, Provost F (2007) Classification in networked data: a toolkit and a univariate case study. J Mach Learn Res 8:935–983
  21. McPherson M, Smith-Lovin L, Cook JM (2001) Birds of a feather: homophily in social networks. Annu Rev Sociol 27:415–444
  22. Menon AK, Elkan C (2010) Predicting labels for dyadic data. Data Min Knowl Discov 21(2):327–343
  23. Neville J, Jensen D (2005) Leveraging relational autocorrelation with latent group models. In: Proceedings of the 4th international workshop on multi-relational mining. ACM, New York, pp 49–55
  24. Newman M, Girvan M (2004) Finding and evaluating community structure in networks. Phys Rev E 69:026113
  25. Palla G, Derényi I, Farkas I, Vicsek T (2005) Uncovering the overlapping community structure of complex networks in nature and society. Nature 435:814–818
  26. Sebastiani F (2002) Machine learning in automated text categorization. ACM Comput Surv 34(1):1–47
  27. Shen H, Cheng X, Cai K, Hu M-B (2009) Detect overlapping and hierarchical community structure in networks. Phys A Stat Mech Its Appl 388(8):1706–1712
  28. Tan P-N, Steinbach M, Kumar V (2005) Introduction to data mining. Addison Wesley, Reading
  29. Tang L, Liu H (2009a) Relational learning via latent social dimensions. In: Proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, New York, pp 817–826
  30. Tang L, Liu H (2009b) Scalable learning of collective behavior based on sparse social dimensions. In: Proceeding of the 18th ACM conference on information and knowledge management. ACM, New York, pp 1107–1116
  31. Tang L, Rajan S, Narayanan VK (2009) Large scale multi-label classification via metalabeler. In: Proceedings of the 18th international conference on World Wide Web. ACM, New York, pp 211–220
  32. Tang L, Wang X, Liu H (2009) Uncovering groups via heterogeneous interaction analysis. In: ICDM, Miami, FL, USA
  33. Tang L, Wang X, Liu H (2011) Group profiling for understanding social structures. ACM Trans Intell Syst Technol (TIST) 3(1), article 15
  34. Tang L, Wang X, Liu H, Wang L (2010) A multi-resolution approach to learning with overlapping communities. In: KDD workshop on social media analytics
  35. Wakita K, Tsurumi T (2007) Finding community structure in mega-scale social networks: [extended abstract]. In: Proceedings of the 16th international conference on World Wide Web. ACM, New York, pp 1275–1276
  36. Wang X, Tang L, Gao H, Liu H (2010) Discovering overlapping groups in social media. In: The 10th IEEE international conference on data mining series, Sydney, Australia
  37. Watts DJ, Strogatz SH (1998) Collective dynamics of ‘small-world’ networks. Nature 393:440–442
  38. Wen Z, Lin C-Y (2010) On the quality of inferring interests from social neighbors. In: Proceedings of the 16th ACM SIGKDD international conference on knowledge discovery and data mining
  39. Yu K, Yu S, Tresp V (2005) Soft clustering on graphs. In: Weiss Y, Schölkopf B, Platt J (eds) Advances in neural information processing systems. MIT Press, Cambridge

Acknowledgments

We thank the authors of [14] for sharing their source code for our empirical study, and the reviewers for their insightful comments. This work is sponsored, in part, by AFOSR and ONR.

Author information

Corresponding author

Correspondence to Xufei Wang.

About this article

Cite this article

Wang, X., Tang, L., Liu, H. et al. Learning with multi-resolution overlapping communities. Knowl Inf Syst 36, 517–535 (2013). https://doi.org/10.1007/s10115-012-0555-0

Keywords

  • Multi-resolution
  • Overlapping communities
  • Hierarchical clustering
  • Social dimensions
  • Network-based classification