Choose the Damping, Choose the Ranking?

  • Marco Bressan
  • Enoch Peserico
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5427)

Abstract

To what extent can changes in PageRank’s damping factor affect node ranking? We prove that, at least on some graphs, the top k nodes assume all possible k! orderings as the damping factor varies, even if it varies within an arbitrarily small interval (e.g. [0.84999, 0.85001]). Thus, the rank of a node for a given (finite set of discrete) damping factor(s) provides very little information about the rank of that node as the damping factor varies over a continuous interval.

We bypass this problem introducing lineage analysis and proving that there is a simple condition, with a “natural” interpretation independent of PageRank, that allows one to verify “in one shot” if a node outperforms another simultaneously for all damping factors and all damping variables (informally, time variant damping factors). The novel notions of strong rank and weak rank of a node provide a measure of the fuzziness of the rank of that node, of the objective orderability of a graph’s nodes, and of the quality of results returned by different ranking algorithms based on the random surfer model.

We deploy our analytical tools on a 41M node snapshot of the .it Web domain and on a 0.7M node snapshot of the CiteSeer citation graph. Among other findings, we show that rank is indeed relatively stable in both graphs; that “classic” PageRank (d = 0.85) marginally outperforms Weighted In-degree (d→0), mainly due to its ability to ferret out “niche” items; and that, for both the Web and CiteSeer, the ideal damping factor appears to be 0.8 − 0.9 to obtain those items of high importance to at least one (model of randomly surfing) user, but only 0.5 − 0.6 to obtain those items important to every (model of randomly surfing) user.

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. 1.
  2. 2.
    Avrachenkov, K., Litvak, N., Son Pham, K.: A singular perturbation approach for choosing PageRank damping factor. ArXiv Mathematics e-prints (2006)Google Scholar
  3. 3.
    Bacchin, M., Ferro, N., Melucci, M.: The effectiveness of a graph-based algorithm for stemming. In: Lim, E.-p., Foo, S.S.-B., Khoo, C., Chen, H., Fox, E., Urs, S.R., Costantino, T. (eds.) ICADL 2002. LNCS, vol. 2555, pp. 117–128. Springer, Heidelberg (2002)CrossRefGoogle Scholar
  4. 4.
    Baeza-Yates, R., Boldi, P., Castillo, C.: Generalizing PageRank: Damping functions for link-based ranking algorithms. In: Proc. ACM SIGIR 2006 (2006)Google Scholar
  5. 5.
    Berry, M.W.: Survey of Text Mining. Springer, Heidelberg (2003)Google Scholar
  6. 6.
    Boldi, P., Santini, M., Vigna, S.: PageRank as a function of the damping factor. In: Proc. ACM WWW 2005 (2005)Google Scholar
  7. 7.
    Boldi, P., Vigna, S.: The WebGraph framework I: Compression techniques. In: Proc. of the Thirteenth International World Wide Web Conference (WWW 2004), Manhattan, USA, pp. 595–601. ACM Press, New York (2004)Google Scholar
  8. 8.
    Cho, J., García-Molina, H., Page, L.: Efficient crawling through URL ordering. Computer Networks and ISDN Systems 30(1-7), 161–172 (1998)CrossRefGoogle Scholar
  9. 9.
    Erkan, G., Radev, D.R.: Lexrank: Graph-based lexical centrality as salience in text summarization. Journal of Artificial Intelligence Research 22, 457–479 (2004)Google Scholar
  10. 10.
    Fagin, R., Kumar, R., Sivakumar, D.: Comparing top k lists. In: Proc. ACM SODA (2003)Google Scholar
  11. 11.
    Haveliwala, T.H.: Efficient computation of pagerank. Technical report (1999)Google Scholar
  12. 12.
    Jiang, X.M., Xue, G.R., Zeng, H.J., Chen, Z., Song, W.-G., Ma, W.-Y.: Exploiting pageRank at different block level. In: Zhou, X., Su, S., Papazoglou, M.P., Orlowska, M.E., Jeffery, K. (eds.) WISE 2004. LNCS, vol. 3306, pp. 241–252. Springer, Heidelberg (2004)CrossRefGoogle Scholar
  13. 13.
    Kamvar, S.D., Haveliwala, T.H., Manning, C.D., Golub, G.H.: Extrapolation methods for accelerating pagerank computations. In: Proceedings of WWW, pp. 261–270. ACM, New York (2003)Google Scholar
  14. 14.
    Kamvar, S.D., Schlosser, M.T., Garcia-Molina, H.: The eigentrust algorithm for reputation management in p2p networks. In: Proc. of ACM WWW 2003(2003)Google Scholar
  15. 15.
    Kleinberg, J.M.: Authoritative sources in a hyperlinked environment. J. ACM 46(5), 604–632 (1999)MathSciNetCrossRefMATHGoogle Scholar
  16. 16.
    Langville, A.N., Meyer, C.D.: Deeper inside PageRank. Internet Math. 1(3), 335–380 (2004)MathSciNetCrossRefMATHGoogle Scholar
  17. 17.
    Langville, A.N., Meyer, C.D.: Google’s PageRank and Beyond: The Science of Search Engine Rankings. Princeton University Press, Princeton (2006)MATHGoogle Scholar
  18. 18.
    Melucci, M., Pretto, L.: PageRank: When order changes. In: Amati, G., Carpineto, C., Romano, G. (eds.) ECiR 2007. LNCS, vol. 4425, pp. 581–588. Springer, Heidelberg (2007)CrossRefGoogle Scholar
  19. 19.
    Page, L., Brin, S., Motwani, R., Winograd, T.: The PageRank citation ranking: Bringing order to the web. Technical report, Stanford Dig. Libr. Tech. Proj. (1998)Google Scholar
  20. 20.
    Peserico, E., Bressan, M.: Choose the Damping, Choose the Ranking? Technical report, Univ. Padova (2008), http://www.dei.unipd.it/~enoch/papers/damprank.pdf
  21. 21.
    Peserico, E., Pretto, L.: What does it mean to converge in rank? In: Proc. ICTIR 2007 (2007)Google Scholar
  22. 22.
    Pretto, L.: A theoretical analysis of google’s PageRank. In: Laender, A.H.F., Oliveira, A.L. (eds.) SPIRE 2002. LNCS, vol. 2476, pp. 131–144. Springer, Heidelberg (2002)CrossRefGoogle Scholar
  23. 23.
    Chakrabarti, S., Dom, B.E., Gibson, D., Kumar, R., Raghavan, P., Rajagopalan, S., Tomkins, A.: Experiments in topic distillation. In: Proc. ACM SIGIR Workshop on Hypertext IR on the Web (1998)Google Scholar
  24. 24.
    Tarau, P., Mihalcea, R., Figa, E.: Semantic document engineering with wordnet and PageRank. In: Proc. ACM SAC 2005 (2005)Google Scholar
  25. 25.
    University of Milan. Laboratory of Web Algorithmics, http://law.dsi.unimi.it/
  26. 26.
    Wangk, K.W.: Item selection by ’hub-authority’ profit ranking. In: Proc. ACM SIGKDD 2002 (2002)Google Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Marco Bressan
    • 1
  • Enoch Peserico
    • 1
  1. 1.Dipartimento di Ingegneria dell’InformazioneUniversità di PadovaItaly

Personalised recommendations