Applied Intelligence, Volume 48, Issue 5, pp 1111–1127

Supervised ranking framework for relationship prediction in heterogeneous information networks

  • Wenxin Liang
  • Xiao Li
  • Xiaosong He
  • Xinyue Liu
  • Xianchao Zhang

Abstract

In recent years, relationship prediction in heterogeneous information networks (HINs) has become an active research topic. The core of this task is how to effectively represent and use three important kinds of information hidden in the connections of the network: local structure information (Local-info), global structure information (Global-info) and attribute information (Attr-info). Although each kind of information captures different features of the network and they influence relationship creation in a complementary way, existing approaches use them separately or only partially combined. In this article, a novel framework named Supervised Ranking framework (S-Rank) is proposed to tackle this issue. To avoid the class imbalance problem, S-Rank treats relationship prediction as a ranking task and divides it into three phases. First, a Supervised PageRank strategy (SPR) ranks the candidate nodes according to Global-info and Attr-info. Second, a Meta Path-based Ranking method (MPR) uses Local-info to rank the candidate nodes based on their meta path-based features. Finally, the two ranking scores are linearly integrated into a final ranking that combines Attr-info, Global-info and Local-info. Experiments on DBLP data demonstrate that S-Rank effectively exploits all three kinds of information for relationship prediction over HINs and outperforms other well-known baseline approaches.
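The final phase described in the abstract, linear integration of two ranking scores, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `min_max` and `combine_rankings`, the `weight` parameter, the min-max rescaling step and the alphabetical tie-break are all assumptions made for illustration, and the input scores stand in for whatever SPR and MPR would produce.

```python
from typing import Dict, List


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so the two rankers are comparable.

    Assumption: the abstract does not say how scores are normalized;
    min-max scaling is used here only as a simple placeholder.
    """
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All candidates scored the same; no ordering information.
        return {k: 0.0 for k in scores}
    return {k: (v - lo) / (hi - lo) for k, v in scores.items()}


def combine_rankings(
    spr_scores: Dict[str, float],
    mpr_scores: Dict[str, float],
    weight: float = 0.5,
) -> List[str]:
    """Rank candidates by weight * SPR + (1 - weight) * MPR.

    `weight` is a hypothetical mixing parameter in [0, 1].
    Only candidates scored by both rankers are returned.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    spr_n, mpr_n = min_max(spr_scores), min_max(mpr_scores)
    candidates = spr_n.keys() & mpr_n.keys()
    final = {
        c: weight * spr_n[c] + (1.0 - weight) * mpr_n[c] for c in candidates
    }
    # Highest combined score first; ties broken by candidate name.
    return sorted(final, key=lambda c: (-final[c], c))


if __name__ == "__main__":
    # Made-up scores for three candidate nodes.
    spr = {"a": 0.9, "b": 0.1, "c": 0.5}
    mpr = {"a": 0.2, "b": 0.8, "c": 0.6}
    print(combine_rankings(spr, mpr, weight=0.5))
```

Setting `weight` to 1.0 or 0.0 reduces the result to the ordering of one ranker alone, which makes it easy to check how much each source of information contributes in this toy setting.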

Keywords

Relationship prediction · Ranking strategy · Meta path · Heterogeneous information networks

Notes

Acknowledgments

This work was partially supported by the National High Technology Research and Development Program (863 Program) of China (No. 2015AA015403) and the National Science Foundation of China (No. 61632019).

Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  1. School of Software, Dalian University of Technology, Dalian, China
  2. A Bit AI Co., Ltd, Danleng SOHO, Beijing, China
