Supervised Link Prediction Using Random Walks
Network structures have become increasingly popular for representing big data over the last few years. As a result, network-based analysis techniques are now applied to networks containing millions of nodes. Link prediction, an essential task in network analysis, uncovers missing or unknown links between nodes.
Random-walk-based methods have shown outstanding performance in this task. However, the primary bottlenecks for such methods are adapting to networks with different structures and dynamics and scaling to the size of the network. Inspired by Random Walk with Restart (RWR), a promising approach for link prediction, this paper proposes a set of path-based features and a supervised learning technique, called Supervised Random Walk with Restart (SRWR), to identify missing links. We show that, using these features, a classifier can successfully order potential links by their closeness to the query node. A new type of heterogeneous network, called the Generalized Bi-relation Network (GBN), is defined in this paper, upon which the novel structural features are introduced. Finally, experiments are performed on a disease-chemical-gene interaction network; the results show that SRWR significantly outperforms the standard RWR algorithm in terms of the Area Under the ROC Curve (AUC) and performs as well as or better than the best algorithms in the field of gene prioritization.
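To make the baseline concrete, the following is a minimal sketch of standard RWR (not the proposed SRWR) that scores every node by its closeness to a query node and ranks the candidates. The function name, the toy 5-node graph, and the restart probability of 0.15 are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def random_walk_with_restart(adj, query, restart=0.15, tol=1e-10, max_iter=1000):
    """Return steady-state RWR scores of all nodes with respect to `query`.

    `adj` is a square adjacency matrix; `restart` is the probability of
    jumping back to the query node at each step (an assumed value here).
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]

    # Column-normalize so that each column is a probability distribution
    # over the neighbours of that node.
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0  # leave isolated nodes as zero columns
    transition = adj / col_sums

    # Restart vector: all mass on the query node.
    e = np.zeros(n)
    e[query] = 1.0

    # Power iteration on p = (1 - c) * W p + c * e.
    p = e.copy()
    for _ in range(max_iter):
        p_next = (1.0 - restart) * (transition @ p) + restart * e
        converged = np.abs(p_next - p).sum() < tol
        p = p_next
        if converged:
            break
    return p


# Hypothetical toy graph with edges 0-1, 0-2, 1-2, 2-3, 3-4.
adjacency = np.array([
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
])

query_node = 0
scores = random_walk_with_restart(adjacency, query_node)

# Candidate nodes ordered from closest to farthest from the query node.
ranking = [int(i) for i in np.argsort(-scores) if i != query_node]
print(ranking)
```

A supervised variant along the lines the abstract describes would replace this single score with a vector of path-based features per candidate and let a classifier produce the ordering; that step is not shown here because the abstract does not specify the features.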
Keywords: Support Vector Machine, Heterogeneous Network, Link Prediction, Candidate Node, Query Node
This material is supported by the National Institutes of Health under grant number R01LM011986. The content of the information in this document does not necessarily reflect the position or the policy of the Government, and no official endorsement should be inferred. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon. This work is also supported in part by the National High-Technology Research and Development Program (863 Program) of China under Grant 2013AA01A212, National Science Foundation Grants 61272067 and 61370229, and a Jiaying University Grant (“Collaboration Mechanism and Application in Social Networks”).