Knowledge and Information Systems, Volume 11, Issue 1, pp 105–129

Node similarity in the citation graph

  • Wangzhong Lu
  • J. Janssen
  • E. Milios
  • N. Japkowicz
  • Yongzheng Zhang
Regular Paper

DOI: 10.1007/s10115-006-0023-9

Cite this article as:
Lu, W., Janssen, J., Milios, E. et al. Knowl Inf Syst (2007) 11: 105. doi:10.1007/s10115-006-0023-9

Abstract

Published scientific articles are linked together into a graph, the citation graph, through their citations. This paper explores the notion of similarity based on connectivity alone, and proposes several algorithms to quantify it. Our metrics take advantage of the local neighborhoods of the nodes in the citation graph. Two variants of link-based similarity estimation between two nodes are described, one based on the separate local neighborhoods of the nodes, and another based on the joint local neighborhood expanded from both nodes at the same time. The algorithms are implemented and evaluated on a subgraph of the citation graph of computer science in a retrieval context. The results are compared with text-based similarity, and demonstrate the complementarity of link-based and text-based retrieval.
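As a rough illustration of what similarity from connectivity alone can mean, the sketch below scores two papers by the overlap of their local neighborhoods in a toy citation graph. It is not the algorithm proposed in the paper: the Jaccard overlap, the undirected treatment of citation links, the `radius` parameter, the function names, and the toy data are all assumptions chosen for brevity. It is loosely in the spirit of the variant that uses the separate local neighborhoods of the two nodes.

```python
from collections import deque


def undirected(citations):
    """Build an undirected adjacency map from a {paper: [cited papers]} dict."""
    graph = {}
    for src, targets in citations.items():
        graph.setdefault(src, set())
        for dst in targets:
            graph[src].add(dst)
            graph.setdefault(dst, set()).add(src)
    return graph


def neighborhood(graph, start, radius):
    """Return all nodes within `radius` hops of `start`, including `start`."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == radius:
            continue  # do not expand beyond the requested radius
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, dist + 1))
    return seen


def jaccard_neighborhood_similarity(graph, a, b, radius=1):
    """Assumed measure: Jaccard overlap of the two separate local neighborhoods."""
    na = neighborhood(graph, a, radius)
    nb = neighborhood(graph, b, radius)
    return len(na & nb) / len(na | nb)


# Hypothetical toy data: p1 and p2 cite the same papers, p5 cites something unrelated.
citations = {
    "p1": ["p3", "p4"],
    "p2": ["p3", "p4"],
    "p5": ["p6"],
}
graph = undirected(citations)
sim_close = jaccard_neighborhood_similarity(graph, "p1", "p2")
sim_far = jaccard_neighborhood_similarity(graph, "p1", "p5")
```

In this toy graph, papers that share their cited works receive a higher score than papers with disjoint neighborhoods, without any use of the papers' text.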

Keywords

  • Networked information spaces
  • Document similarity metric
  • Citation graph
  • Digital libraries

Copyright information

© Springer-Verlag London Limited 2006

Authors and Affiliations

  • Wangzhong Lu
    • 1
  • J. Janssen
    • 2
  • E. Milios
    • 1
  • N. Japkowicz
    • 3
  • Yongzheng Zhang
    • 1
  1. Faculty of Computer Science, Dalhousie University, Halifax, Canada
  2. Department of Mathematics and Statistics, Dalhousie University, Halifax, Canada
  3. School of Information Technology and Engineering, University of Ottawa, Ottawa, Canada
