Information Retrieval

, Volume 8, Issue 2, pp 245–264

Rank-Stability and Rank-Similarity of Link-Based Web Ranking Algorithms in Authority-Connected Graphs

Authors

  • R. Lempel
    • IBM Research Lab
    • Department of Computer ScienceTechnion
Article

DOI: 10.1007/s10791-005-5661-0

Cite this article as:
Lempel, R. & Moran, S. Inf Retrieval (2005) 8: 245. doi:10.1007/s10791-005-5661-0

Abstract

Web search algorithms that rank Web pages by examining the link structure of the Web are attractive from both theoretical and practical aspects. Today’s prevailing link-based ranking algorithms rank Web pages by using the dominant eigenvector of certain matrices—like the co-citation matrix or variations thereof. Recent analyses of ranking algorithms have focused attention on the case where the corresponding matrices are irreducible, thus avoiding singularities of reducible matrices. Consequently, rank analysis has been concentrated on authority connected graphs, which are graphs whose co-citation matrix is irreducible (after deleting zero rows and columns). Such graphs conceptually correspond to thematically related collections, in which most pages pertain to a single, dominant topic of interest.

A link-based search algorithm A is rank-stable if minor changes in the link structure of the input graph, which is usually a subgraph of the Web, do not affect the ranking it produces; algorithms A,B are rank-similar if they produce similar rankings. These concepts were introduced and studied recently for various existing search algorithms.

This paper studies the rank-stability and rank-similarity of three link-based ranking algorithms—PageRank, HITS and SALSA—in authority connected graphs. For this class of graphs, we show that neither HITS nor PageRank is rank stable. We then show that HITS and PageRank are not rank similar on this class, nor is any of them rank similar to SALSA.

Keywords

Web IRcitationlink analysis

Copyright information

© Springer Science + Business Media, Inc. 2005