Data Mining and Knowledge Discovery, Volume 30, Issue 1, pp 147–180

Link prediction using time series of neighborhood-based node similarity scores

  • İsmail Güneş
  • Şule Gündüz-Öğüdücü
  • Zehra Çataltepe

Abstract

We propose a link prediction method for evolving networks. The method first computes several node similarity scores (e.g. Common Neighbor, Preferential Attachment, Adamic–Adar, Jaccard), together with their weighted versions, for different past time periods. It then forecasts the future node similarity scores from these past scores using ARIMA, a powerful time series forecasting model. This forecasting-based approach lets link prediction model how past node similarities change, as well as external factors. The method applies to evolving networks and can predict both new and recurring links. We compare its link prediction performance with that of previously proposed time series and similarity based link prediction methods, under different circumstances and using different AUC measures. We show that the method proposed in this article performs better than the previous methods.
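The pipeline described in the abstract can be illustrated with a minimal sketch, assuming unweighted, undirected snapshots without self-loops. The function names (`neighbors`, `similarity_scores`, `score_series`, `drift_forecast`) and the toy data are hypothetical and not taken from the article. The abstract does not give the ARIMA configuration, so the forecasting step here uses a random walk with drift, which corresponds to the simple special case ARIMA(0,1,0) with a constant, purely as a stand-in for the fitted ARIMA model.

```python
import math


def neighbors(edges):
    """Build an adjacency map from an iterable of undirected edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def similarity_scores(adj, u, v):
    """Neighborhood-based similarity scores for the node pair (u, v)."""
    nu, nv = adj.get(u, set()), adj.get(v, set())
    common = nu & nv
    union = nu | nv
    return {
        "common_neighbor": len(common),
        "preferential_attachment": len(nu) * len(nv),
        # A common neighbor of two distinct nodes has degree >= 2, so log > 0.
        "adamic_adar": sum(1.0 / math.log(len(adj[z])) for z in common),
        "jaccard": len(common) / len(union) if union else 0.0,
    }


def score_series(snapshots, u, v, measure):
    """One score per past time period, oldest first."""
    return [similarity_scores(neighbors(edges), u, v)[measure]
            for edges in snapshots]


def drift_forecast(series):
    """One-step forecast of a random walk with drift (stand-in for ARIMA)."""
    if len(series) < 2:
        return float(series[-1])
    drift = (series[-1] - series[0]) / (len(series) - 1)
    return series[-1] + drift


# Hypothetical toy network observed over three past time periods.
snapshots = [
    [("a", "c"), ("b", "d")],
    [("a", "c"), ("b", "c"), ("b", "d")],
    [("a", "c"), ("b", "c"), ("a", "d"), ("b", "d")],
]
series = score_series(snapshots, "a", "b", "common_neighbor")
print(series, drift_forecast(series))
```

Candidate pairs would then be ranked by their forecast scores, with higher scores suggesting a more likely future link. A full ARIMA fit, as named in the abstract, would replace `drift_forecast` with a fitted model from a time series library.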

Keywords

Network data; Evolving networks; Social networks; Link prediction; Time series; Node similarities


Copyright information

© The Author(s) 2015

Authors and Affiliations

  • İsmail Güneş (1)
  • Şule Gündüz-Öğüdücü (1)
  • Zehra Çataltepe (1)
  1. Computer Engineering Department, Istanbul Technical University, Istanbul, Turkey
