Link prediction in directed social networks

Original Article

Abstract

In today’s online social networks, it has become essential to help newcomers and existing community members find new social contacts. In the scientific literature, this recommendation task is known as link prediction. Link prediction has important practical applications on social network platforms: it allows platform providers to recommend friends to their users, and it can be used to infer missing links in partially observed networks. A shortcoming of many existing link prediction methods is that they focus on undirected graphs only. This work addresses that gap by introducing link prediction methods and metrics for directed graphs. We compare well-known similarity metrics and assess their suitability for link prediction in directed social networks. We advance existing techniques and propose mining subgraph patterns that are used to predict links in networks such as GitHub, GooglePlus, and Twitter. Our results show that the proposed metrics and techniques yield more accurate predictions than metrics that do not account for the directed nature of the underlying networks.
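To illustrate why edge direction can matter for similarity-based link prediction, the following minimal sketch contrasts a direction-aware common-neighbors count with its undirected counterpart. The scoring rule (counting two-step paths u -> w -> v), the function names, and the toy edge list are assumptions made for illustration only; they are not necessarily the metrics or data used in this work.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Return (successors, predecessors) maps for directed edges (u, v)."""
    succ = defaultdict(set)
    pred = defaultdict(set)
    for u, v in edges:
        succ[u].add(v)
        pred[v].add(u)
    return succ, pred


def directed_common_neighbors(succ, pred, u, v):
    """Count intermediaries w with u -> w and w -> v (hypothetical directed score)."""
    return len(succ.get(u, set()) & pred.get(v, set()))


def undirected_common_neighbors(succ, pred, u, v):
    """Count shared neighbors of u and v when edge direction is ignored."""
    neighbors_u = succ.get(u, set()) | pred.get(u, set())
    neighbors_v = succ.get(v, set()) | pred.get(v, set())
    return len((neighbors_u & neighbors_v) - {u, v})


# Toy follower graph (made up for this example): an edge (u, v) means "u follows v".
edges = [("a", "b"), ("b", "c"), ("a", "d"), ("d", "c"), ("c", "e")]
succ, pred = build_adjacency(edges)

score_a_to_c = directed_common_neighbors(succ, pred, "a", "c")
score_c_to_a = directed_common_neighbors(succ, pred, "c", "a")
score_undirected = undirected_common_neighbors(succ, pred, "a", "c")

print(score_a_to_c, score_c_to_a, score_undirected)
```

In this toy graph the directed score separates the candidate link a -> c from c -> a, whereas the undirected score assigns both the same value, which is the kind of information a direction-agnostic metric discards.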

Keywords

Social networks, Link prediction, Metrics, Directed graphs, Patterns

Copyright information

© Springer-Verlag Wien 2014

Authors and Affiliations

  1. Siemens Corporate Technology, Vienna, Austria
