Link Prediction in Schema-Rich Heterogeneous Information Network

  • Xiaohuan Cao
  • Yuyan Zheng
  • Chuan Shi (corresponding author)
  • Jingzhi Li
  • Bin Wu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9651)


Recent years have witnessed a boom in heterogeneous information networks (HINs), which contain different types of nodes and relations. Many data mining tasks have been explored on this kind of network. Among them, link prediction is an important task that predicts potential links among nodes and is required in many applications. Contemporary link prediction methods are usually based on simple HINs whose schema is bipartite or a star schema. In these HINs, the meta paths are predefined or can be enumerated. However, for much real networked data, it is hard to describe the network structure with a simple schema. For example, a knowledge base in RDF format includes tens of thousands of types of objects and links. On this kind of schema-rich HIN, it is impossible to enumerate meta paths. In this paper, we study link prediction in schema-rich HINs and propose a novel Link Prediction with automatic meta Paths method (LiPaP). LiPaP designs an algorithm called Automatic Meta Path Generation (AMPG) to automatically extract meta paths from a schema-rich HIN, and a supervised method with a likelihood function to learn the weights of the extracted meta paths. Experiments on a real knowledge base, Yago, validate that LiPaP is an effective, steady and efficient method.
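
To make the notion of a meta path concrete, the following is a minimal sketch, not the AMPG algorithm itself, whose details are not given in this abstract. It assumes a toy HIN stored as (source, relation, target) triples and treats a meta path as the sequence of relation types along a path between two nodes. The triples, the `^-1` marker for inverse relations, and the function names `build_adjacency` and `meta_paths_between` are all hypothetical choices made for illustration.

```python
from collections import defaultdict

# Hypothetical toy HIN as (source, relation, target) triples.
# The data is made up purely for illustration.
TRIPLES = [
    ("alice", "wasBornIn", "paris"),
    ("bob", "wasBornIn", "paris"),
    ("alice", "worksAt", "acme"),
    ("bob", "worksAt", "acme"),
    ("carol", "worksAt", "acme"),
]


def build_adjacency(triples):
    """Index edges by source node; every triple also yields an inverse edge."""
    adj = defaultdict(list)
    for source, relation, target in triples:
        adj[source].append((relation, target))
        # Assumption: inverse relations are marked with a '^-1' suffix.
        adj[target].append((relation + "^-1", source))
    return adj


def meta_paths_between(adj, source, target, max_len=2):
    """Return {relation sequence: number of path instances} from source to target.

    Breadth-first expansion up to max_len hops. Paths may revisit nodes,
    which keeps the sketch short at the cost of some redundant instances.
    """
    counts = defaultdict(int)
    frontier = [(source, ())]
    for _ in range(max_len):
        next_frontier = []
        for node, path in frontier:
            for relation, neighbor in adj[node]:
                new_path = path + (relation,)
                if neighbor == target:
                    counts[new_path] += 1
                next_frontier.append((neighbor, new_path))
        frontier = next_frontier
    return dict(counts)


adj = build_adjacency(TRIPLES)
print(meta_paths_between(adj, "alice", "bob"))
```

In this toy network, the pair (alice, bob) is connected by two relation sequences, one through a shared birthplace and one through a shared employer, each with a single path instance. Instance counts of this kind could serve as features for a supervised model that assigns a weight to each discovered meta path; that learning step is not shown here.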


Heterogeneous information network; Link prediction; Similarity measure; Meta path



This work is supported in part by the National Key Basic Research and Development (973) Program of China (No. 2013CB329606), the National Natural Science Foundation of China (No. 71231002, 61375058, 11571161), the CCF-Tencent Open Fund, the Co-construction Project of Beijing Municipal Commission of Education, and Shenzhen Sci.-Tech Fund No. JCYJ20140509143748226.



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Xiaohuan Cao (1)
  • Yuyan Zheng (1)
  • Chuan Shi (1), corresponding author
  • Jingzhi Li (2)
  • Bin Wu (1)
  1. Beijing Key Lab of Intelligent Telecommunications Software and Multimedia, Beijing University of Posts and Telecommunications, Beijing, China
  2. Department of Mathematics, Southern University of Science and Technology, Shenzhen, China
