Node Classification in Social Networks

Social Network Data Analytics

Abstract

When dealing with large graphs, such as those that arise in the context of online social networks, a subset of nodes may be labeled. These labels can indicate demographic values, interests, beliefs or other characteristics of the nodes (users). A core problem is to use this information to extend the labeling so that all nodes are assigned a label (or labels).

In this chapter, we survey classification techniques that have been proposed for this problem. We consider two broad categories: methods based on iterative application of traditional classifiers using graph information as features, and methods which propagate the existing labels via random walks. We adopt a common perspective on these methods to highlight the similarities between different approaches within and across the two categories. We also describe some extensions and related directions to the central problem of node classification.
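
To make the second category concrete, the following is a minimal sketch of label propagation on a small graph. It is not taken from the chapter: the function name `propagate_labels`, the toy path graph, and the fixed iteration count are all illustrative assumptions. Labeled (seed) nodes keep their labels, and each unlabeled node repeatedly replaces its label distribution with the average of its neighbors' distributions, which is one simple way to view propagation via random walks.

```python
def propagate_labels(adjacency, seed_labels, num_iters=100):
    """Illustrative label propagation (hypothetical helper, not the chapter's code).

    adjacency:   dict mapping each node to a list of its neighbors
    seed_labels: dict mapping some nodes to a known label
    Returns (predicted label per node, label distribution per node).
    """
    classes = sorted(set(seed_labels.values()))

    # Seeds start with a one-hot distribution; other nodes start uniform.
    dist = {}
    for node in adjacency:
        if node in seed_labels:
            dist[node] = {c: 1.0 if c == seed_labels[node] else 0.0 for c in classes}
        else:
            dist[node] = {c: 1.0 / len(classes) for c in classes}

    for _ in range(num_iters):
        new_dist = {}
        for node, neighbors in adjacency.items():
            # Seed nodes stay clamped; isolated nodes have nothing to average.
            if node in seed_labels or not neighbors:
                new_dist[node] = dist[node]
                continue
            # Unlabeled node: average the neighbors' current distributions.
            new_dist[node] = {
                c: sum(dist[n][c] for n in neighbors) / len(neighbors)
                for c in classes
            }
        dist = new_dist

    labels = {node: max(classes, key=lambda c: dist[node][c]) for node in adjacency}
    return labels, dist


# Toy example (assumed data): a path 0 - 1 - 2 - 3 with the two ends labeled.
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
seeds = {0: "A", 3: "B"}
labels, dist = propagate_labels(graph, seeds)
```

In this toy graph, node 1 sits closer to the seed labeled "A" and node 2 closer to the seed labeled "B", so each unlabeled node ends up with the label of its nearer seed. An iterative-classification method would instead train a conventional classifier on features derived from neighbors' labels and reapply it until the assignments stabilize.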

Author information

Corresponding author

Correspondence to Smriti Bhagat.

Copyright information

© 2011 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Bhagat, S., Cormode, G., Muthukrishnan, S. (2011). Node Classification in Social Networks. In: Aggarwal, C. (ed.) Social Network Data Analytics. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-8462-3_5

  • DOI: https://doi.org/10.1007/978-1-4419-8462-3_5

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4419-8461-6

  • Online ISBN: 978-1-4419-8462-3

  • eBook Packages: Computer Science (R0)
