Online Sampling of High Centrality Individuals in Social Networks

  • Arun S. Maiya
  • Tanya Y. Berger-Wolf
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6118)

Abstract

In this work, we investigate the use of online or “crawling” algorithms to sample large social networks in order to determine the most influential or important individuals within the network (by varying definitions of network centrality). We describe a novel sampling technique based on concepts from expander graphs. We empirically evaluate this method in addition to other online sampling strategies on several real-world social networks. We find that, by sampling nodes to maximize the expansion of the sample, we are able to approximate the set of most influential individuals across multiple measures of centrality.
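
The idea of sampling nodes to maximize the expansion of the sample can be illustrated with a minimal greedy sketch. This is not taken from the paper itself: the function name `expansion_sample`, the adjacency-dictionary graph representation, the greedy selection rule, and the tie-breaking order are all assumptions made for illustration. At each step the crawler looks only at neighbors of nodes it has already visited (the online setting) and adds the one that contributes the most nodes not yet in, or adjacent to, the sample.

```python
from typing import Dict, Hashable, List, Set

# Hypothetical representation: undirected graph as node -> set of neighbors.
Graph = Dict[Hashable, Set[Hashable]]


def expansion_sample(graph: Graph, seed: Hashable, k: int) -> List[Hashable]:
    """Greedily crawl up to k nodes, favoring nodes that expand the sample.

    Illustrative sketch only; the selection rule is an assumption, not
    necessarily the exact procedure evaluated in the paper.
    """
    sample = [seed]
    in_sample = {seed}
    # Frontier: nodes adjacent to the sample but not yet sampled.
    frontier = set(graph[seed]) - in_sample

    while len(sample) < k and frontier:
        # Nodes already in the sample or already adjacent to it.
        covered = in_sample | frontier
        # Pick the frontier node with the most neighbors outside `covered`.
        # Sorting by repr makes tie-breaking deterministic.
        best = max(sorted(frontier, key=repr),
                   key=lambda v: len(graph[v] - covered))
        sample.append(best)
        in_sample.add(best)
        frontier |= graph[best]
        frontier -= in_sample

    return sample


if __name__ == "__main__":
    toy = {
        "a": {"b", "c"},
        "b": {"a", "d", "e", "f"},
        "c": {"a", "d"},
        "d": {"b", "c"},
        "e": {"b"},
        "f": {"b"},
    }
    print(expansion_sample(toy, "a", 3))
```

In this toy graph, node `b` is chosen before `c` because it brings three previously unseen neighbors into the frontier, whereas `c` brings only one. Centrality measures could then be computed on the subgraph induced by the returned sample and compared against those of the full network.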

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Arun S. Maiya (1)
  • Tanya Y. Berger-Wolf (1)
  1. Department of Computer Science, University of Illinois at Chicago, Chicago, USA
