Part of the book series: Lecture Notes in Social Networks (LNSN)

Abstract

We develop node embeddings, a distributed representation of nodes, for large-scale social network applications. We compute embeddings for nodes based on their attributes and links. We show that node embeddings can effectively reflect community structure in networks and can thus be useful for a wide range of community-related applications. We consider node embeddings in two different community-related mining tasks.

First, we propose a generic integration of node embeddings for network processing in community detection algorithms. Our strategy re-adjusts input networks by adding and trimming links, using embedding-based node distances. We empirically show that the strategy can remove up to 32.16% of the links from the DBLP (computer science literature) citation network, yet improve the performance of different community detection algorithms under different evaluation metrics.

Second, we show that these embeddings can support many community-based mining tasks in social networks, including analyses of community homogeneity and distance, and detection of community connectors (inter-community outliers, that is, actors who connect communities). These tasks benefit from the convenient and efficient structural comparisons that node embeddings provide. Our experimental results include many interesting insights about DBLP. For example, prior to 2013 the best way for research in Natural Language & Speech to gain “best-paper” recognition was to emphasize aspects related to Machine Learning & Pattern Recognition.
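The link re-adjustment and community-distance ideas described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the distance measure, the thresholds, or how candidate links are chosen, so everything here is an assumption made for illustration: cosine distance, the hypothetical parameters `trim_above` and `add_below`, the function names, and the toy two-dimensional embeddings. It is not the chapter's actual procedure.

```python
from itertools import combinations

import numpy as np


def cosine_distance(u, v):
    """Cosine distance: 0 when the vectors point the same way, 1 when orthogonal."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return 1.0 - float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def readjust_links(edges, emb, trim_above, add_below):
    """Return a re-adjusted set of undirected links.

    edges      : iterable of (node, node) pairs
    emb        : dict mapping node -> embedding vector (assumed to be given)
    trim_above : an existing link is dropped if its distance exceeds this
    add_below  : a missing link is added if its distance is below this
    Both thresholds are hypothetical knobs, not values from the chapter.
    """
    current = {tuple(sorted(e)) for e in edges}
    # Trim: keep only links whose endpoints are close in embedding space.
    kept = {(a, b) for a, b in current
            if cosine_distance(emb[a], emb[b]) <= trim_above}
    # Add: link unconnected pairs that are very close. This scans all pairs,
    # which is quadratic and only suitable for a toy example.
    added = {(a, b) for a, b in combinations(sorted(emb), 2)
             if (a, b) not in current
             and cosine_distance(emb[a], emb[b]) < add_below}
    return kept | added


def community_distance(members_a, members_b, emb):
    """Cosine distance between the mean embeddings of two communities."""
    centroid_a = np.mean([emb[n] for n in members_a], axis=0)
    centroid_b = np.mean([emb[n] for n in members_b], axis=0)
    return cosine_distance(centroid_a, centroid_b)


# Toy data: "a" and "b" are similar, "c" and "d" are similar.
toy_emb = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [0.1, 0.9]}
toy_edges = [("a", "c"), ("c", "d")]

print(sorted(readjust_links(toy_edges, toy_emb, trim_above=0.5, add_below=0.1)))
# prints [('a', 'b'), ('c', 'd')]
```

In this toy case the link between the dissimilar nodes `a` and `c` is trimmed, and a link between the similar but unconnected nodes `a` and `b` is added. A community detection algorithm would then run on the re-adjusted edge set instead of the original one.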

Author information

Corresponding author

Correspondence to Thuy Vu.

Copyright information

© 2017 Springer International Publishing AG

About this chapter

Cite this chapter

Vu, T., Parker, D.S. (2017). Mining Community Structure with Node Embeddings. In: Kaya, M., Erdoǧan, Ö., Rokne, J. (eds) From Social Data Mining and Analysis to Prediction and Community Detection. Lecture Notes in Social Networks. Springer, Cham. https://doi.org/10.1007/978-3-319-51367-6_6

  • DOI: https://doi.org/10.1007/978-3-319-51367-6_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-51366-9

  • Online ISBN: 978-3-319-51367-6

  • eBook Packages: Computer Science (R0)
