Abstract
To what extent can changes in PageRank’s damping factor affect node ranking? We prove that, at least on some graphs, the top k nodes assume all possible k! orderings as the damping factor varies, even if it varies within an arbitrarily small interval (e.g. [0.84999, 0.85001]). Thus, the rank of a node for a given (finite set of discrete) damping factor(s) provides very little information about the rank of that node as the damping factor varies over a continuous interval.
We bypass this problem introducing lineage analysis and proving that there is a simple condition, with a “natural” interpretation independent of PageRank, that allows one to verify “in one shot” if a node outperforms another simultaneously for all damping factors and all damping variables (informally, time variant damping factors). The novel notions of strong rank and weak rank of a node provide a measure of the fuzziness of the rank of that node, of the objective orderability of a graph’s nodes, and of the quality of results returned by different ranking algorithms based on the random surfer model.
We deploy our analytical tools on a 41M node snapshot of the .it Web domain and on a 0.7M node snapshot of the CiteSeer citation graph. Among other findings, we show that rank is indeed relatively stable in both graphs; that “classic” PageRank (d = 0.85) marginally outperforms Weighted In-degree (d→0), mainly due to its ability to ferret out “niche” items; and that, for both the Web and CiteSeer, the ideal damping factor appears to be 0.8 − 0.9 to obtain those items of high importance to at least one (model of randomly surfing) user, but only 0.5 − 0.6 to obtain those items important to every (model of randomly surfing) user.
This work was supported in part by MIUR under PRIN Mainstream and by EU under Integr. Proj. AEOLUS (IP-FP6-015964).
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
References
CiteSeer metadata, http://citeseer.ist.psu.edu/oai.html
Avrachenkov, K., Litvak, N., Son Pham, K.: A singular perturbation approach for choosing PageRank damping factor. ArXiv Mathematics e-prints (2006)
Bacchin, M., Ferro, N., Melucci, M.: The effectiveness of a graph-based algorithm for stemming. In: Lim, E.-p., Foo, S.S.-B., Khoo, C., Chen, H., Fox, E., Urs, S.R., Costantino, T. (eds.) ICADL 2002. LNCS, vol. 2555, pp. 117–128. Springer, Heidelberg (2002)
Baeza-Yates, R., Boldi, P., Castillo, C.: Generalizing PageRank: Damping functions for link-based ranking algorithms. In: Proc. ACM SIGIR 2006 (2006)
Berry, M.W.: Survey of Text Mining. Springer, Heidelberg (2003)
Boldi, P., Santini, M., Vigna, S.: PageRank as a function of the damping factor. In: Proc. ACM WWW 2005 (2005)
Boldi, P., Vigna, S.: The WebGraph framework I: Compression techniques. In: Proc. of the Thirteenth International World Wide Web Conference (WWW 2004), Manhattan, USA, pp. 595–601. ACM Press, New York (2004)
Cho, J., García-Molina, H., Page, L.: Efficient crawling through URL ordering. Computer Networks and ISDN Systems 30(1-7), 161–172 (1998)
Erkan, G., Radev, D.R.: Lexrank: Graph-based lexical centrality as salience in text summarization. Journal of Artificial Intelligence Research 22, 457–479 (2004)
Fagin, R., Kumar, R., Sivakumar, D.: Comparing top k lists. In: Proc. ACM SODA (2003)
Haveliwala, T.H.: Efficient computation of pagerank. Technical report (1999)
Jiang, X.M., Xue, G.R., Zeng, H.J., Chen, Z., Song, W.-G., Ma, W.-Y.: Exploiting pageRank at different block level. In: Zhou, X., Su, S., Papazoglou, M.P., Orlowska, M.E., Jeffery, K. (eds.) WISE 2004. LNCS, vol. 3306, pp. 241–252. Springer, Heidelberg (2004)
Kamvar, S.D., Haveliwala, T.H., Manning, C.D., Golub, G.H.: Extrapolation methods for accelerating pagerank computations. In: Proceedings of WWW, pp. 261–270. ACM, New York (2003)
Kamvar, S.D., Schlosser, M.T., Garcia-Molina, H.: The eigentrust algorithm for reputation management in p2p networks. In: Proc. of ACM WWW 2003(2003)
Kleinberg, J.M.: Authoritative sources in a hyperlinked environment. J. ACM 46(5), 604–632 (1999)
Langville, A.N., Meyer, C.D.: Deeper inside PageRank. Internet Math. 1(3), 335–380 (2004)
Langville, A.N., Meyer, C.D.: Google’s PageRank and Beyond: The Science of Search Engine Rankings. Princeton University Press, Princeton (2006)
Melucci, M., Pretto, L.: PageRank: When order changes. In: Amati, G., Carpineto, C., Romano, G. (eds.) ECiR 2007. LNCS, vol. 4425, pp. 581–588. Springer, Heidelberg (2007)
Page, L., Brin, S., Motwani, R., Winograd, T.: The PageRank citation ranking: Bringing order to the web. Technical report, Stanford Dig. Libr. Tech. Proj. (1998)
Peserico, E., Bressan, M.: Choose the Damping, Choose the Ranking? Technical report, Univ. Padova (2008), http://www.dei.unipd.it/~enoch/papers/damprank.pdf
Peserico, E., Pretto, L.: What does it mean to converge in rank? In: Proc. ICTIR 2007 (2007)
Pretto, L.: A theoretical analysis of google’s PageRank. In: Laender, A.H.F., Oliveira, A.L. (eds.) SPIRE 2002. LNCS, vol. 2476, pp. 131–144. Springer, Heidelberg (2002)
Chakrabarti, S., Dom, B.E., Gibson, D., Kumar, R., Raghavan, P., Rajagopalan, S., Tomkins, A.: Experiments in topic distillation. In: Proc. ACM SIGIR Workshop on Hypertext IR on the Web (1998)
Tarau, P., Mihalcea, R., Figa, E.: Semantic document engineering with wordnet and PageRank. In: Proc. ACM SAC 2005 (2005)
University of Milan. Laboratory of Web Algorithmics, http://law.dsi.unimi.it/
Wangk, K.W.: Item selection by ’hub-authority’ profit ranking. In: Proc. ACM SIGKDD 2002 (2002)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bressan, M., Peserico, E. (2009). Choose the Damping, Choose the Ranking?. In: Avrachenkov, K., Donato, D., Litvak, N. (eds) Algorithms and Models for the Web-Graph. WAW 2009. Lecture Notes in Computer Science, vol 5427. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-95995-3_7
Download citation
DOI: https://doi.org/10.1007/978-3-540-95995-3_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-95994-6
Online ISBN: 978-3-540-95995-3
eBook Packages: Computer ScienceComputer Science (R0)