Similarity Modeling on Heterogeneous Networks via Automatic Path Discovery

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11052)


Heterogeneous networks are widely used to model real-world semi-structured data. The key challenge of learning over such networks is modeling node similarity under both network structure and content. To deal with network structure, most existing works assume a given or enumerable set of meta-paths and then leverage them to compute meta-path-based proximities or network embeddings. However, expert knowledge for choosing meta-paths is not always available, and as the length of the considered meta-paths increases, the number of possible paths grows exponentially, which makes the path search process very costly. On the other hand, while there is often rich content around network nodes, it has hardly been leveraged to further improve similarity modeling. In this work, to properly model node similarity in content-rich heterogeneous networks, we propose to automatically discover useful paths for pairs of nodes under both structural and content information. To this end, we combine continuous reinforcement learning and deep content embedding into a novel semi-supervised joint learning framework. Specifically, the supervised reinforcement learning component explores useful paths between a small set of example similar pairs of nodes, while the unsupervised deep embedding component captures node contents and enables inductive learning on the whole network. The two components are jointly trained in a closed loop to mutually enhance each other. Extensive experiments on three real-world heterogeneous networks demonstrate the clear advantages of our algorithm. Code related to this paper is available at:
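
To make the abstract's point about exponential growth concrete, the following is a minimal sketch that counts candidate meta-paths (sequences of node types) of a given length over a small network schema. The schema, the type names, and the function `count_meta_paths` are hypothetical and chosen only for illustration; they are not taken from the paper or its released code.

```python
from typing import Dict, List

# Hypothetical bibliographic schema (an assumption for illustration):
# each node type maps to the node types it may link to.
SCHEMA: Dict[str, List[str]] = {
    "author": ["paper"],
    "paper": ["author", "venue", "term", "paper"],
    "venue": ["paper"],
    "term": ["paper"],
}


def count_meta_paths(schema: Dict[str, List[str]], start: str, length: int) -> int:
    """Count node-type sequences with `length` edges that begin at `start`."""
    # counts[t] = number of type sequences of the current length ending at type t
    counts = {node_type: 0 for node_type in schema}
    counts[start] = 1
    for _ in range(length):
        extended = {node_type: 0 for node_type in schema}
        for node_type, n_paths in counts.items():
            for neighbor in schema[node_type]:
                extended[neighbor] += n_paths
        counts = extended
    return sum(counts.values())


if __name__ == "__main__":
    for length in range(1, 9):
        print(length, count_meta_paths(SCHEMA, "author", length))
```

Even on this four-type schema the count more than doubles with each additional edge at larger lengths, which suggests why enumerating all meta-paths quickly becomes impractical on schemas with more node and edge types.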


Keywords: Similarity modeling, Heterogeneous networks, Deep embedding



Research was sponsored in part by U.S. Army Research Lab. under Cooperative Agreement No. W911NF-09-2-0053 (NSCTA), DARPA under Agreement No. W911NF-17-C-0099, National Science Foundation IIS 16-18481, IIS 17-04532, and IIS-17-41317, DTRA HDTRA11810026, and grant 1U54GM114838 awarded by NIGMS through funds provided by the trans-NIH Big Data to Knowledge (BD2K) initiative (




Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. University of Illinois at Urbana-Champaign, Urbana, USA
