Advertisement

Machine Learning

, Volume 82, Issue 2, pp 157–189 | Cite as

Detecting communities and their evolutions in dynamic social networks—a Bayesian approach

  • Tianbao Yang
  • Yun Chi
  • Shenghuo Zhu
  • Yihong Gong
  • Rong Jin
Article

Abstract

Although a large body of work is devoted to finding communities in static social networks, only a few studies examined the dynamics of communities in evolving social networks. In this paper, we propose a dynamic stochastic block model for finding communities and their evolution in a dynamic social network. The proposed model captures the evolution of communities by explicitly modeling the transition of community memberships for individual nodes in the network. Unlike many existing approaches for modeling social networks that estimate parameters by their most likely values (i.e., point estimation), in this study, we employ a Bayesian treatment for parameter estimation that computes the posterior distributions for all the unknown parameters. This Bayesian treatment allows us to capture the uncertainty in parameter values and therefore is more robust to data noise than point estimation. In addition, an efficient algorithm is developed for Bayesian inference to handle large sparse social networks. Extensive experimental studies based on both synthetic data and real-life data demonstrate that our model achieves higher accuracy and reveals more insights in the data than several state-of-the-art algorithms.

Keywords

Social network Community Community evolution Dynamic stochastic block model Bayesian inference Gibbs sampling 

References

  1. Ahmed, A., & Xing, E. P. (2008). Dynamic non-parametric mixture models and the recurrent Chinese restaurant process: with applications to evolutionary clustering. In SDM (pp. 219–230). Philadelphia: SIAM. Google Scholar
  2. Airoldi, E. M., Blei, D. M., Fienberg, S. E., & Xing, E. P. (2006). Mixed membership stochastic block models for relational data with application to protein-protein interactions. In Proceedings of the international biometrics society annual meeting. Google Scholar
  3. Asur, S., Parthasarathy, S., & Ucar, D. (2007). An event-based framework for characterizing the evolutionary behavior of interaction graphs. In Proceedings of the 13th ACM SIGKDD conference. Google Scholar
  4. Bishop, C. M. (2006). Pattern recognition and machine learning (information science and statistics). New York: Springer. Google Scholar
  5. Brandes, U., Delling, D., Gaertler, M., Gorke, R., Hoefer, M., Nikoloski, Z., & Wagner, D. (2008). On modularity clustering. IEEE Transactions on Knowledge and Data Engineering, 20(2). Google Scholar
  6. Chakrabarti, D., Kumar, R., & Tomkins, A. (2006). Evolutionary clustering. In Proceedings of the 12th ACM SIGKDD conference. Google Scholar
  7. Chen, J., Zaiane, O. R., & Goebel, R. (2009). Detecting communities in social networks using max-min modularity. In SDM’09: proceedings of the 9th SIAM international conference on data mining. Google Scholar
  8. Chi, Y., Song, X., Zhou, D., Hino, K., & Tseng, B. L. (2007). Evolutionary spectral clustering by incorporating temporal smoothness. In Proceedings of the 13th ACM SIGKDD conference. Google Scholar
  9. Chung, F. R. K. (1997). Spectral graph theory. Providence: American Mathematical Society. zbMATHGoogle Scholar
  10. Fienberg, S. E., Meyer, M. M., & Wasserman, S. S. (1985). Statistical analysis of multiple sociometric relations. Journal of the American Statistical Association, 80(389). Google Scholar
  11. Flake, G., Lawrence, S., & Giles, C. (2000). Efficient identification of web communities. In Proceedings of the 6th ACM SIGKDD conference. Google Scholar
  12. Freeman, L. C. (2003). Finding social groups: a meta-analysis of the southern women data. New York: National Academies Press. Google Scholar
  13. Geman, S., & Geman, D. (1984). Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6. Google Scholar
  14. Gong, Y., & Xu, W. (2007). Machine learning for multimedia content analysis. Berlin: Springer. Google Scholar
  15. Griffiths, T. L., & Steyvers, M. (2004). Finding scientific topics. Proceedings of the National Academy of Sciences of the United States of America, 101(Suppl. 1), 5228–5235. CrossRefGoogle Scholar
  16. Ho, P. D., Raftery, A. E., & H. M. S. (2002). Statistical analysis of multiple sociometric relations. Latent Space Approaches to Social Network Analysis, 97. Google Scholar
  17. Hofman, J. M., & Wiggins, C. H. (2008). A Bayesian approach to network modularity. Physical Review Letters, 100. Google Scholar
  18. Holland, P., & Leinhardt, S. (1976). Local structure in social networks. Sociological Methodology. Google Scholar
  19. Kemp, C., Griffiths, T. L., & Tenenbaum, J. B. (2004). In Discovering latent classes in relational data (Tech. Rep. AI Memo 2004-019). MIT, Computer Science and Artificial Intelligence Laboratory. Google Scholar
  20. Kim, M. S., & Han, J. (2009). A particle-and-density based evolutionary clustering method for dynamic networks. Proceedings VLDB Endowment, 2(1), 622–633. Google Scholar
  21. Kumar, R., Novak, J., Raghavan, P., & Tomkins, A. (2003). On the bursty evolution of blogspace. In Proceedings of the 12th WWW conference. Google Scholar
  22. Leskovec, J., Kleinberg, J., & Faloutsos, C. (2005). Graphs over time: densification laws, shrinking diameters and possible explanations. In Proceedings of the 11th ACM SIGKDD conference. Google Scholar
  23. Lin, Y. R., Chi, Y., Zhu, S., Sundaram, H., & Tseng, B. L. (2008). FacetNet: a framework for analyzing communities and their evolutions in dynamic networks. In Proceedings of the 17th WWW conference. Google Scholar
  24. Lin, Y. R., Chi, Y., Zhu, S., Sundaram, H., & Tseng, B. L. (2009a). Analyzing communities and their evolutions in dynamic social networks. ACM Transactions on Knowledge Discovery from Data, 3(2). Google Scholar
  25. Lin, Y. R., Sun, J., Castro, P., Konuru, R., Sundaram, H., & Kelliher, A. (2009b). Metafac: community discovery via relational hypergraph factorization. In KDD ’09: proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining (pp. 527–536). New York: ACM. doi: 10.1145/1557019.1557080. CrossRefGoogle Scholar
  26. Mei, Q., & Zhai, C. (2005). Discovering evolutionary theme patterns from text: an exploration of temporal text mining. In Proceedings of the 11th ACM SIGKDD conference. Google Scholar
  27. Newman, M. E. J. (2006). Modularity and community structure in networks. Proceedings of the National Academy of Sciences of the United States of America Google Scholar
  28. Newman, M. E. J., & Girvan, M. (2004). Finding and evaluating community structure in networks. Physical Review E. Google Scholar
  29. Palla, G., Barabasi, A. L., & Vicsek, T. (2007). Quantifying social group evolution. Nature, 446. Google Scholar
  30. Sarkar, P., & Moore, A. W. (2005). Dynamic social network analysis using latent space models. SIGKDD Exploration Newsletter, 7(2). Google Scholar
  31. Shi, J., & Malik, J. (2000). Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8). Google Scholar
  32. Shortreed, S., Handcock, M. S., & Hoff, P. (2006). A particle-and-density based evolutionary clustering method for dynamic networks. Methodology: European Journal of Research Methods for the Behavioral and Social Sciences, 2(1), 24–33. CrossRefGoogle Scholar
  33. Snijders, T. A. B. (2002). Markov chain Monte Carlo estimation of exponential random graph models. Journal of Social Structure, 3. Google Scholar
  34. Spiliopoulou, M., Ntoutsi, I., Theodoridis, Y., & Schult, R. (2006). Monic: modeling and monitoring cluster transitions. In Proceedings of the 12th ACM SIGKDD conference. Google Scholar
  35. Sun, J., Faloutsos, C., Papadimitriou, S., & Yu, P. S. (2007). GraphScope: parameter-free mining of large time-evolving graphs. In Proceedings of the 13th SIGKDD conference. Google Scholar
  36. Tang, L., Liu, H., Zhang, J., & Nazeri, Z. (2008). Community evolution in dynamic multi-mode networks. In KDD ’08: Proceeding of the 14th ACM SIGKDD international conference on knowledge discovery and data mining (pp. 677–685). New York: ACM. doi: 10.1145/1401890.1401972. CrossRefGoogle Scholar
  37. Tantipathananandh, C., Berger-Wolf, T., & Kempe, D. (2007). A framework for community identification in dynamic social networks. In Proceedings of the 13th ACM SIGKDD conference. Google Scholar
  38. Toyoda, M., & Kitsuregawa, M. (2003). Extracting evolution of web communities from a series of web archives. In HYPERTEXT ’03: proceedings of the 14th ACM conference on hypertext and hypermedia. Google Scholar
  39. Wasserman, S., & Faust, K. (1994). Social network analysis: methods and applications. Cambridge: Cambridge University Press. Google Scholar
  40. Wasserman, S., & Pattison, P. (1996). Logit models and logistic regressions for social networks, I: an introduction to Markov graphs and p*. Psychometrika, 60. Google Scholar
  41. White, S., & Smyth, P. (2005). A spectral clustering approach to finding communities in graph. In SDM. Google Scholar
  42. Xu, W., & Gong, Y. (2004). Document clustering by concept factorization. In SIGIR (pp. 202–209). Google Scholar
  43. Yang, T., Chi, Y., Zhu, S., Gong, Y., & Jin, R. (2009). A Bayesian approach toward finding communities and their evolutions in dynamic social networks. In SDM’09: proceedings of the 2009 SIAM international conference on data mining (pp. 990–1001). Google Scholar
  44. Yu, K., Yu, S., & Tresp, V. (2005). Soft clustering on graphs. In NIPS. Google Scholar
  45. Zhu, X. (2005). Semi-supervised learning with graphs. PhD thesis, Carnegie Mellon University. Google Scholar

Copyright information

© The Author(s) 2010

Authors and Affiliations

  • Tianbao Yang
    • 1
  • Yun Chi
    • 2
  • Shenghuo Zhu
    • 2
  • Yihong Gong
    • 2
  • Rong Jin
    • 1
  1. 1.Department of Computer Science and Engineering Michigan State UniversityEast LansingUSA
  2. 2.NEC Laboratories AmericaCupertinoUSA

Personalised recommendations