Social Network Analysis and Mining

, Volume 3, Issue 2, pp 127–141

Supervised methods for multi-relational link prediction

  • Darcy Davis
  • Ryan Lichtenwalter
  • Nitesh V. Chawla
Original Article

Abstract

Many important real-world systems, modeled naturally as complex networks, have heterogeneous interactions and complicated dependency structures. Link prediction in such networks must model the influences between heterogenous relationships and distinguish the formation mechanisms of each link type, a task which is beyond the simple topological features commonly used to score potential links. In this paper, we introduce a novel probabilistically weighted extension of the Adamic/Adar measure for heterogenous information networks, which we use to demonstrate the potential benefits of diverse evidence, particularly in cases where homogeneous relationships are very sparse. However, we also expose some fundamental flaws of traditional unsupervised link prediction. We develop supervised learning approaches for relationship (link) prediction in multi-relational networks, and demonstrate that a supervised approach to link prediction can enhance performance. We present results on three diverse, real-world heterogeneous information networks and discuss the trends and tradeoffs of supervised and unsupervised link prediction in a multi-relational setting.

References

  1. Adamic L, Adar E (2003) Friends and neighbors on the web. Soc Netw 25(3):211–230CrossRefGoogle Scholar
  2. Barabási A (2003) Linked: how everything is connected to everything else and what it means. Penguin Group, New YorkGoogle Scholar
  3. Barabási A, Jeong H, Néda Z, Ravasz E, Schubert A, Vicsek T (2002) Evolution of the social network of scientific collaborations. Phys A Stat Mech Appl 311(3-4):590–614MATHCrossRefGoogle Scholar
  4. Breiman L (1996) Bagging predictors. Mach Learn 24(2):123–140MathSciNetMATHGoogle Scholar
  5. Breiman L (2001) Random forests. Mach Learn 45(1):5–32MATHCrossRefGoogle Scholar
  6. Brin S, Page L (1998) The anatomy of a large-scale hypertextual web search engine. Comput Netw ISDN Syst 30(1–7):107–117CrossRefGoogle Scholar
  7. Christakis N, Fowler J (2007) The spread of obesity in a large social network over 32 years. New Engl J Med 357(4):370–379CrossRefGoogle Scholar
  8. Christakis N, Fowler J (2008) The collective dynamics of smoking in a large social network. New Engl J Med 358(21):2249–2258CrossRefGoogle Scholar
  9. Davis J, Goadrich M (2006) The relationship between precision-recall and ROC curves. In: Proceedings of the 23rd international conference on machine learning ACM, pp 233–240Google Scholar
  10. Friedkin N, Johnsen E (1990) Social influence and opinions. J Math Sociol 15(3):193–206MATHCrossRefGoogle Scholar
  11. Goadrich M, Oliphant L, Shavlik J (2004) Learning ensembles of first-order clauses for recall-precision curves: a case study in biomedical information extraction. In: Inductive logic programming, pp 421–456Google Scholar
  12. Han J (2009) Mining heterogeneous information networks by exploring the power of links. In: Discovery science, Springer, pp 13–30Google Scholar
  13. Holland P, Leinhardt S (1970) A method for detecting structure in sociometric data. Am J Sociol 76(3):492–513CrossRefGoogle Scholar
  14. Huang Z, Li X, Chen H (2005) Link prediction approach to collaborative filtering. In: Proceedings of the 5th ACM/IEEE-CS joint conference on digital libraries, ACM, pp 141–142Google Scholar
  15. Kas M, Carley K, Carley L (2012) Trends in science networks: understanding structures and statistics of scientific networks. In: Social network analysis and mining, pp 1–19Google Scholar
  16. Kautz H, Selman B, Shah M (1997) Referral Web: combining social networks and collaborative filtering. Commun ACM 40(3):63–65CrossRefGoogle Scholar
  17. Liben-Nowell D, Kleinberg J (2003) The link-prediction problem for social networks. In: Proceedings of the 12th international conference on information and knowledge management, pp 556–559Google Scholar
  18. Lichtenwalter R, Lussier J, Chawla N (2010) New perspectives and methods in link prediction. In: Proceedings of the 16th ACM SIGKDD international conference on knowledge discovery and data mining, ACM, pp 243–252Google Scholar
  19. McPherson M, Smith-Lovin L, Cook J (2001) Birds of a feather: Homophily in social networks. Ann Rev Sociol 27:415–444CrossRefGoogle Scholar
  20. Newman M (2001) Clustering and preferential attachment in growing networks. Phys Rev E 64(2):025102Google Scholar
  21. Newman M (2001) The structure of scientific collaboration networks. Proc Natl Acad Sci 98(2):404MathSciNetMATHCrossRefGoogle Scholar
  22. O’Madadhain J, Hutchins J, Smyth P (2005) Prediction and ranking algorithms for event-based network data. ACM SIGKDD Explor Newslett 7(2):23–30CrossRefGoogle Scholar
  23. Pelan A, Steinhaeuser K, Chawla NV, de Alwis Pitts DA, Ganguly AR (2011) Empirical comparison of correlation measures and pruning levels in complex networks representing the global climate system. In: IEEE symposium series on computational intelligence and data miningGoogle Scholar
  24. Pržulj N, Corneil D, Jurisica I (2004) Modeling interactome: scale-free or geometric? Bioinformatics 20(18):3508CrossRefGoogle Scholar
  25. Radivojac P, Peng K, Clark W, Peters B, Mohan A, Boyle S, Mooney S (2008) An integrated approach to inferring gene–disease associations in humans. Proteins Struct Funct Bioinform 72(3):1030–1037CrossRefGoogle Scholar
  26. Raeder T, Chawla N (2011) Market basket analysis with networks. In: Social network analysis and mining, pp 1–17Google Scholar
  27. Raghavan V, Bollmann P, Jung G (1989) A critical investigation of recall and precision as measures of retrieval system performance. ACM Trans Inform Syst (TOIS) 7(3):205–229CrossRefGoogle Scholar
  28. Rual J, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, Berriz G, Gibbons F, Dreze M, Ayivi-Guedehoussou N et al (2005) Towards a proteome-scale map of the human protein–protein interaction network. Nature 437(7062):1173–1178CrossRefGoogle Scholar
  29. Salton G, McGill M (1983) Introduction to modern information retrieval. McGraw-Hill, New YorkGoogle Scholar
  30. Scott J (2011) Social network analysis: developments, advances, and prospects. In: Social network analysis and mining, pp 1–6Google Scholar
  31. Stelzl U, Worm U, Lalowski M, Haenig C, Brembeck F, Goehler H, Stroedicke M, Zenkner M, Schoenherr A, Koeppen S et al (2005) A human protein-protein interaction network: a resource for annotating the proteome. Cell 122(6):957–968CrossRefGoogle Scholar
  32. Tang L, Wang X, Liu H (2009) Uncovering groups via heterogeneous interaction analysis. In: Proceedings of the 9th IEEE international conference on data mining, pp 503–512Google Scholar
  33. Wasserman S, Faust K (1994) Social network analysis: methods and applications. Cambridge university press, CambridgeGoogle Scholar

Copyright information

© Springer-Verlag 2012

Authors and Affiliations

  • Darcy Davis
    • 1
  • Ryan Lichtenwalter
    • 1
  • Nitesh V. Chawla
    • 1
  1. 1.University of Notre DameNotre DameUSA

Personalised recommendations