Using Polynomial Chaos to Compute the Influence of Multiple Random Surfers in the PageRank Model

  • Paul G. Constantine
  • David F. Gleich
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4863)

Abstract

The PageRank equation computes the importance of pages in a web graph relative to a single random surfer with a constant teleportation coefficient. To be globally relevant, the teleportation coefficient should account for the influence of all users. Therefore, we correct the PageRank formulation by modeling the teleportation coefficient as a random variable distributed according to user behavior. With this correction, the PageRank values themselves become random. We present two methods to quantify the uncertainty in the random PageRank: a Monte Carlo sampling algorithm and an algorithm based on the truncated polynomial chaos expansion of the random quantities. With each of these methods, we compute the expectation and standard deviation of the PageRanks. Our statistical analysis shows that the standard deviations of the PageRanks are uncorrelated with the PageRank vector.
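The Monte Carlo approach named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy graph, the Beta(17, 3) distribution for the teleportation coefficient, the uniform teleportation vector, and the function names `pagerank` and `monte_carlo_pagerank` are all assumptions made for illustration. The sketch draws samples of the coefficient, solves an ordinary PageRank problem for each draw, and reports the sample mean and sample standard deviation of each page's score.

```python
import random
import statistics


def pagerank(out_links, alpha, tol=1e-12, max_iter=1000):
    """Power iteration for PageRank with teleportation coefficient alpha.

    out_links[i] lists the pages that page i links to.
    Teleportation is uniform over all pages (an assumption of this sketch).
    """
    n = len(out_links)
    x = [1.0 / n] * n
    for _ in range(max_iter):
        # Rank held by dangling pages (no out-links) is spread uniformly.
        dangling = sum(x[i] for i in range(n) if not out_links[i])
        new = [(1.0 - alpha) / n + alpha * dangling / n] * n
        for i, targets in enumerate(out_links):
            for j in targets:
                new[j] += alpha * x[i] / len(targets)
        # Stop once the 1-norm change between iterates is below tol.
        if sum(abs(a - b) for a, b in zip(new, x)) < tol:
            return new
        x = new
    return x


def monte_carlo_pagerank(out_links, sample_alpha, n_samples=500, seed=0):
    """Estimate the mean and standard deviation of each page's PageRank.

    sample_alpha(rng) must return one draw of the teleportation coefficient.
    """
    rng = random.Random(seed)
    samples = [pagerank(out_links, sample_alpha(rng)) for _ in range(n_samples)]
    per_page = list(zip(*samples))  # per_page[i] holds every sample for page i
    mean = [statistics.fmean(values) for values in per_page]
    std = [statistics.stdev(values) for values in per_page]
    return mean, std


# Hypothetical 4-page graph; page 3 has no out-links.
graph = [[1, 2], [2], [0], []]

# Assumed user-behavior model: alpha ~ Beta(17, 3), which has mean 0.85.
mean, std = monte_carlo_pagerank(graph, lambda rng: rng.betavariate(17, 3))
for page, (m, s) in enumerate(zip(mean, std)):
    print(f"page {page}: mean={m:.4f} std={s:.4f}")
```

Each sample costs one full PageRank solve, so the estimate's accuracy improves only with the square root of the number of samples. The polynomial chaos method mentioned in the abstract is an alternative way to compute the same statistics and is not shown here.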


Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Paul G. Constantine (1)
  • David F. Gleich (1)
  1. Stanford University, Institute for Computational and Mathematical Engineering
