Abstract
Link prediction is a classic problem in complex network analysis: predicting possible links from the known network structure. Because similar nodes should have closer embedding vectors under network representation learning, in this paper we propose a Graph ATtention network method based on node Similarity (SiGAT) for link prediction. Specifically, we first compute a set of similar nodes for each node in the network using a traditional similarity method. The graph attention mechanism then assigns a weight to each similar node and each first-order neighbor. Next, we obtain the embedding vector of each node by aggregating the information of its similar nodes and first-order neighbors. By incorporating similar nodes, the node embeddings preserve more of the network's structural information in the low-dimensional embedding space. Finally, SiGAT represents the link between a pair of nodes by concatenating the two node embedding vectors and trains a classifier to predict new potential links. Experiments on five real-world datasets (Yeast, Cora, BIO-CE-HT, Human proteins (Vidal), and Human proteins (Stelzl)) and on large-scale artificial LFR benchmark datasets show that SiGAT outperforms existing popular approaches.
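The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes Jaccard similarity as the "traditional" similarity method, one-hot node features, a single untrained attention head with a self-loop, and hypothetical function names (`jaccard_similar_nodes`, `attention_aggregate`, `edge_features`).

```python
import numpy as np

def jaccard_similar_nodes(adj, k):
    """For each node, return up to k other nodes with the highest nonzero Jaccard score."""
    n = adj.shape[0]
    neighbors = [set(np.flatnonzero(adj[i]).tolist()) for i in range(n)]
    similar = []
    for i in range(n):
        scores = []
        for j in range(n):
            if i == j:
                continue
            union = neighbors[i] | neighbors[j]
            if union:
                s = len(neighbors[i] & neighbors[j]) / len(union)
                if s > 0:
                    scores.append((s, j))
        # Highest score first; ties broken by node index for determinism.
        scores.sort(key=lambda t: (-t[0], t[1]))
        similar.append([j for _, j in scores[:k]])
    return similar

def attention_aggregate(features, adj, similar, W, a):
    """One GAT-style layer attending over first-order neighbors plus similar nodes."""
    h = features @ W
    out = np.zeros_like(h)
    for i in range(h.shape[0]):
        # Assumption: the node itself is included, as in standard GAT self-loops.
        context = sorted(set(np.flatnonzero(adj[i]).tolist()) | set(similar[i]) | {i})
        logits = np.array([a @ np.concatenate([h[i], h[j]]) for j in context])
        logits = np.where(logits > 0, logits, 0.2 * logits)  # LeakyReLU
        weights = np.exp(logits - logits.max())
        weights /= weights.sum()  # softmax over the context set
        out[i] = weights @ h[context]
    return out

def edge_features(emb, pairs):
    """Represent each candidate link by concatenating its two node embeddings."""
    return np.array([np.concatenate([emb[u], emb[v]]) for u, v in pairs])

# Toy undirected graph with edges (0,1), (0,2), (1,2), (2,3).
adj = np.zeros((4, 4), dtype=int)
for u, v in [(0, 1), (0, 2), (1, 2), (2, 3)]:
    adj[u, v] = adj[v, u] = 1

similar = jaccard_similar_nodes(adj, k=1)

rng = np.random.default_rng(0)
dim = 3
W = rng.normal(size=(4, dim))   # untrained projection (illustrative only)
a = rng.normal(size=2 * dim)    # untrained attention vector (illustrative only)

emb = attention_aggregate(np.eye(4), adj, similar, W, a)
X = edge_features(emb, [(0, 1), (0, 3)])
```

In this toy graph, node 3 is the most similar node to node 0 even though the two are not directly connected, so the similar-node set contributes information that first-order neighbors alone would miss. In a full pipeline, `W` and `a` would be learned, and rows of `X` for known links and sampled non-links would be passed to a binary classifier.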
Data availability statement
This manuscript has associated data in a data repository. Readers can find the main code and datasets for this paper at the following websites: https://github.com/wefwfrfg/SiGAT, http://konect.cc/networks/, https://networkrepository.com/bio.php, and https://linqs-data.soe.ucsc.edu/public/lbc/cora.tgz.
Acknowledgements
The authors acknowledge anonymous reviewers for their time and effort in reviewing this paper. This work is supported in part by the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (No. 22KJD120002).
Author information
Contributions
KY and YL: conceptualization, methodology, data curation, writing, visualization, and investigation. ZZ, XZ and PD: supervision, reviewing, and editing.
Additional information
This work was partially supported by the Natural Science Foundation of the Jiangsu Higher Education Institutions of China under Grant No. 22KJD120002.
About this article
Cite this article
Yang, K., Liu, Y., Zhao, Z. et al. Graph attention network via node similarity for link prediction. Eur. Phys. J. B 96, 27 (2023). https://doi.org/10.1140/epjb/s10051-023-00495-1
DOI: https://doi.org/10.1140/epjb/s10051-023-00495-1