Integrating node centralities, similarity measures, and machine learning classifiers for link prediction

Multimedia Tools and Applications

Abstract

Link prediction is a widely studied topic in graph data analytics, with applications such as friend recommendation in social networks and product recommendation in e-commerce. It refers to predicting new connections, or edges, that may arise in the near future among the nodes of a network. Existing link prediction methods are generally based on local, semi-local, or global features of networks, and their performance is usually inconsistent across different and large networks. In this paper, we propose a generic and improved method of link prediction, named NSMLLP, that integrates Node centralities, Similarity measures, and Machine Learning classifiers. We calculate popularity measures for every node and evaluate similarity measures for every pair of nodes in the network. The combined popularity and similarity measures form the features for every node pair. These features for the nodes at the two ends of an edge, together with the positive or negative edge label, form a well-defined dataset for the task of link prediction. This dataset is then fed into machine learning classifiers such as a Random Forest classifier, an AdaBoost classifier, and an ANN-based classifier. The results obtained from these classifiers are then combined to make the final link prediction. We provide an information gain study to quantify the improvement brought by the proposed method, and a feature importance study to better understand the relative importance of the popularity and similarity measures used. Experimental results on multiple real-life networks demonstrate that the proposed technique outperforms many popular existing link prediction methods on several evaluation criteria.
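
The dataset construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not list which centralities or similarity measures NSMLLP uses, so degree centrality, common neighbors, the Jaccard coefficient, and the Adamic-Adar index are assumed here as stand-ins. The function names (`build_adjacency`, `degree_centrality`, `pair_features`, `build_dataset`) and the toy edge list are hypothetical, and the classifier stage is omitted.

```python
import math
import random
from itertools import combinations


def build_adjacency(edges):
    """Return an undirected adjacency map {node: set of neighbors}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_centrality(adj):
    """Degree divided by (n - 1): an assumed stand-in popularity measure."""
    n = len(adj)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}


def pair_features(adj, cent, u, v):
    """Popularity of both endpoints followed by three similarity scores."""
    common = adj[u] & adj[v]
    union = adj[u] | adj[v]
    jaccard = len(common) / len(union) if union else 0.0
    # Adamic-Adar: each common neighbor z contributes 1 / log(degree(z)).
    # A common neighbor of two distinct nodes has degree >= 2, so log > 0.
    adamic_adar = sum(1.0 / math.log(len(adj[z])) for z in common)
    return [cent[u], cent[v], float(len(common)), jaccard, adamic_adar]


def build_dataset(edges, seed=0):
    """Label existing edges 1 and an equal-sized sample of non-edges 0."""
    adj = build_adjacency(edges)
    cent = degree_centrality(adj)
    positives = sorted({tuple(sorted(e)) for e in edges})
    non_edges = [p for p in combinations(sorted(adj), 2) if p not in set(positives)]
    rng = random.Random(seed)
    negatives = rng.sample(non_edges, min(len(positives), len(non_edges)))
    X = [pair_features(adj, cent, u, v) for u, v in positives + negatives]
    y = [1] * len(positives) + [0] * len(negatives)
    return X, y


# Hypothetical toy network: a triangle (0, 1, 2) with a tail 2-3-4.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
X, y = build_dataset(edges)
for row, label in zip(X, y):
    print(label, [round(value, 3) for value in row])
```

The resulting `X` and `y` are the kind of labeled feature table that could be passed to classifiers such as Random Forest or AdaBoost. A fuller pipeline would typically hide the positive test edges before computing features, so that an edge's label does not leak into its own feature values.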

Author information

Corresponding author

Correspondence to Sanjay Kumar.

Ethics declarations

Conflict of Interests

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Anand, S., Rahul, Mallik, A. et al. Integrating node centralities, similarity measures, and machine learning classifiers for link prediction. Multimed Tools Appl 81, 38593–38621 (2022). https://doi.org/10.1007/s11042-022-12854-8

  • DOI: https://doi.org/10.1007/s11042-022-12854-8
