Exploiting Transitive Similarity and Temporal Dynamics for Similarity Search in Heterogeneous Information Networks

  • Jiazhen He
  • James Bailey
  • Rui Zhang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8422)

Abstract

Heterogeneous information networks have attracted much attention in recent years and a key challenge is to compute the similarity between two objects. In this paper, we study the problem of similarity search in heterogeneous information networks, and extend the meta path-based similarity measure PathSim by incorporating richer information, such as transitive similarity and temporal dynamics. Experiments on a large DBLP network show that our improved similarity measure is more effective at identifying similar authors in terms of their future collaborations.
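For readers unfamiliar with the baseline, the following is a minimal sketch of plain PathSim as it is commonly defined for a symmetric meta path: the similarity of objects x and y is 2·M[x][y] / (M[x][x] + M[y][y]), where M is the commuting matrix that counts path instances. It does not implement the transitive or temporal extensions proposed in this paper, whose details are not given on this page. The author-venue counts and all function names are hypothetical, chosen only for illustration.

```python
# Baseline PathSim only; NOT the extended measure proposed in the paper.
# All data and helper names below are assumptions made for illustration.

def transpose(matrix):
    """Return the transpose of a list-of-lists matrix."""
    return [list(row) for row in zip(*matrix)]


def matmul(a, b):
    """Multiply two list-of-lists matrices."""
    b_t = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_t] for row in a]


def pathsim(commuting, x, y):
    """PathSim between objects x and y given a symmetric commuting matrix."""
    denom = commuting[x][x] + commuting[y][y]
    return 2 * commuting[x][y] / denom if denom else 0.0


# author_venue[i][j]: number of papers author i published in venue j (made up).
author_venue = [
    [2, 1],  # author 0
    [4, 2],  # author 1: same venue profile as author 0, but more papers
    [0, 3],  # author 2: publishes only in the second venue
]

# Commuting matrix for the symmetric meta path author-venue-author.
commuting = matmul(author_venue, transpose(author_venue))

print(pathsim(commuting, 0, 1))
print(pathsim(commuting, 0, 2))
```

In this toy data, author 0 scores higher against author 1 than against author 2, because the normalisation by self-path counts rewards a shared venue profile rather than raw path counts alone.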

Keywords

similarity search, heterogeneous network, meta path

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Jiazhen He (1, 2)
  • James Bailey (1, 2)
  • Rui Zhang (1)

  1. Department of Computing and Information Systems, The University of Melbourne, Australia
  2. Victoria Research Laboratory, National ICT Australia, Australia