Skip to main content
Log in

On efficient randomized algorithms for finding the PageRank vector

  • Published:
Computational Mathematics and Mathematical Physics Aims and scope Submit manuscript

Abstract

Two randomized methods are considered for finding the PageRank vector; in other words, the solution of the system p T = p T P with a stochastic n × n matrix P, where n ∼ 107–109, is sought (in the class of probability distributions) with accuracy ɛ: ɛ ≫ n −1. Thus, the possibility of brute-force multiplication of P by the column is ruled out in the case of dense objects. The first method is based on the idea of Markov chain Monte Carlo algorithms. This approach is efficient when the iterative process p T t+1 = p T t P quickly reaches a steady state. Additionally, it takes into account another specific feature of P, namely, the nonzero off-diagonal elements of P are equal in rows (this property is used to organize a random walk over the graph with the matrix P). Based on modern concentration-of-measure inequalities, new bounds for the running time of this method are presented that take into account the specific features of P. In the second method, the search for a ranking vector is reduced to finding the equilibrium in the antagonistic matrix game

$$\mathop {\min }\limits_{p \in S_n (1)} \mathop {\max }\limits_{u \in S_n (1)} \left\langle {u,\left( {P^T - I} \right)p} \right\rangle ,$$

where S n (1) is a unit simplex in ℝn and I is the identity matrix. The arising problem is solved by applying a slightly modified Grigoriadis-Khachiyan algorithm (1995). This technique, like the Nazin-Polyak method (2009), is a randomized version of Nemirovski’s mirror descent method. The difference is that randomization in the Grigoriadis-Khachiyan algorithm is used when the gradient is projected onto the simplex rather than when the stochastic gradient is computed. For sparse matrices P, the method proposed yields noticeably better results.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  1. S. Brin and L. Page, “The anatomy of a large-scale hypertextual web search engine,” Comput. Network ISDN Syst. 30(1–7), 107–117 (1998).

    Article  Google Scholar 

  2. A. N. Langville and C. D. Meyer, Google’s PageRank and Beyond: The Science of Search Engine Rankings (Princeton Univ. Press, New York, 2006).

    Google Scholar 

  3. P. Baldi, P. Frasconi, and P. Smyth, Modeling the Internet and the Web: Probabilistic Methods and Algorithms (Wiley, New York, 2003); http://ibook.ics.uci.edu/.

    Google Scholar 

  4. M. Franceschet, “PageRank: Standing on the shoulders of giants,” Commun. ACM 54(6), 92–101 (2011); arXiv:1002.2858v3.

    Article  Google Scholar 

  5. A. V. Gasnikov, E. V. Gasnikova, and O. S. Fed’ko, “On possible dynamics in Google’s PageRank model and a modified model of computation correspondence matrix,” Tr. Mosk. Fiz.-Tekh. Inst. 4(2(14)), 101–120 (2012).

    Google Scholar 

  6. R. P. Agaev and P. Yu. Chebotarev, “Convergence and stability in characteristics-matching problems (overview of basic results),” Upr. Bol’sh. Sist. 30(1), 470–505 (2010).

    Google Scholar 

  7. D. A. Levin, Y. Peres, and E. L. Wilmer, Markov Chain and Mixing Times (AMS, 2009); http://pages.uoregon.edu/dlevin/MARKOV/markovmixing.pdf.

    Google Scholar 

  8. A. V. Nazin and B. T. Polyak, “Randomized algorithm to determine the eigenvector of a stochastic matrix with application to the PageRank problem,” Autom. Remote Control 72(2), 342–352 (2011).

    Article  MATH  MathSciNet  Google Scholar 

  9. Y. E. Nesterov, “Subgradient methods for huge-scale optimization problems,” CORE Discussion Paper; 2012/2, 2012; http://dial.academielouvain.be/handle/boreal:107876.

    Google Scholar 

  10. Y. E. Nesterov, “Efficiency of coordinate descent methods on large scale optimization problem,” SIAM J. Optim. 22(2), 341–362 (2012); http://dial.academielouvain.be/handle/boreal:121612.

    Article  MATH  MathSciNet  Google Scholar 

  11. A. Juditsky, G. Lan, A. Nemirovski, and A. Shapiro, “Stochastic approximation approach to stochastic programming,” SIAM J. Optim. 19(4), 1574–1609 (2009).

    Article  MATH  MathSciNet  Google Scholar 

  12. L. G. Khachiyan, Selected Works (MTsNMO, Moscow, 2009), pp. 38–48 [in Russian].

    MATH  Google Scholar 

  13. Y. Nesterov and A. Nemirovski, “Finding the stationary states of Markov chains by iterative methods,” CORE Discussion Paper; 2012/58, 2012; http://dial.academielouvain.be/handle/boreal:122163.

    Google Scholar 

  14. B. T. Polyak and A. A. Tremba, “Regularization-based solution of the PageRank problem for large matrices,” Autom. Remote Control 73(11), 1777–1894 (2012).

    Article  MathSciNet  Google Scholar 

  15. D. Spielman, http://www.cse.cuhk.edu.hk/~chi/csc5160/notes/L07.pdf.

  16. B. S. Kashin, “Diameters of some finite-dimensional sets and classes of smooth functions,” Math. USSR Izv. 11(2), 317–333 (1977).

    Article  MATH  Google Scholar 

  17. G. Pandurangan, P. Raghavan, and E. Upfal, “Using PageRank to characterize web structure,” Internet Math. 3(1), 1–20 (2006).

    Article  MATH  MathSciNet  Google Scholar 

  18. N. S. Bakhvalov, N. P. Zhidkov, and G. M. Kobel’kov, Numerical Methods (Binom, Moscow, 2002).

    Google Scholar 

  19. H. Nikaido, Convex Structures and Economic Theory (Academic, New York, 1968; Mir, Moscow, 1972).

    MATH  Google Scholar 

  20. V. S. Ryaben’kii, Introduction to Computational Mathematics (Fizmatlit, Moscow, 2000) [in Russian].

    MATH  Google Scholar 

  21. P. Chebotarev and R. Agaev, “Forest matrices around the Laplacian matrix,” Linear Algebra Appl. 356, 253–274 (2002).

    Article  MATH  MathSciNet  Google Scholar 

  22. D. K. Faddeev and V. N. Faddeeva, Computational Methods of Linear Algebra (Freeman, San Francisco, 1963; Fizmatgiz, Moscow, 1963).

    Google Scholar 

  23. M. A. Krasnosel’skii and S. G. Krein, “Remark on the distribution of errors in the solution of a system of linear equations by means of an iterative process,” Usp. Mat. Nauk 7(4(50)), 157–161 (1952).

    MATH  MathSciNet  Google Scholar 

  24. R. A. Horn and C. R. Johnson, Matrix Analysis (Cambridge Univ. Press, Cambridge, 1985; Mir, Moscow, 1989).

    Book  MATH  Google Scholar 

  25. S. M. Ermakov, Monte Carlo Method in Computational Mathematics: An Introductory Course (Binom, Moscow, 2009).

    Google Scholar 

  26. A. M. Raigorodskii, Internet Models (Intellekt, Dolgoprudnyi, 2013).

    Google Scholar 

  27. D. Knuth and A. Yao, “The complexity of nonuniform random number generation,” Algorithms and Complexity: New Directions and Recent Results (Academic, New York, 1976), pp. 357–428.

    Google Scholar 

  28. A. Sinclair and M. Jerrum, “Approximate counting, uniform generation and rapidly mixing Markov chains,” Inf. Comput. 82(1), 93–133 (1989).

    Article  MATH  MathSciNet  Google Scholar 

  29. M. Dyer, A. Frieze, and R. Kannan, “A random polynomial-time algorithm for approximating of the volume of convex bodies,” J. ACM 38(1), 1–17 (1991).

    Article  MATH  MathSciNet  Google Scholar 

  30. M. Jerrum and A. Sinclair, “The Markov chain Monte Carlo method: An approach to approximate counting and integration,” Approximation Algorithms for NP-Hard Problems, Ed. by D. S. Hochbaum (PWS, Boston, 1996), pp. 482–520; http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.32.9553.

    Google Scholar 

  31. P. Lezaud, “Chernoff-type bound for finite Markov chains,” Ann. Appl. Probab. 8(3), 849–867 (1998).

    Article  MATH  MathSciNet  Google Scholar 

  32. D. Aldous and J. Fill, Reversible Markov Chains and Random Walks on Graphs (2002); http://stat-www.berkeley.edu/users/aldous/RWG/book.html.

    Google Scholar 

  33. Fan Chung. “Laplacians and the Cheeger inequality for directed graphs,” Ann. Combin., No. 9, 1–19 (2005); http://math.ucsd.edu/~fan/.

    Google Scholar 

  34. P. Diaconis, “The Markov chain Monte Carlo revolution,” Bull. AMS (New Ser.) 49(2), 179–205 (2009); http://math.uchicago.edu/~shmuel/Network-course-readings/MCMCRev.pdf.

    MathSciNet  Google Scholar 

  35. D. Spielman, Lecture No. 2, 2009; http://www.cse.cuhk.edu.hk/~chi/csc5160/notes/L02.pdf.

    Google Scholar 

  36. A. Joulin and Y. Ollivier, “Curvature, concentration and error estimates for Markov chain Monte Carlo,” Ann. Probab. 38(6), 2418–2442 (2010).

    Article  MATH  MathSciNet  Google Scholar 

  37. R. Montenegro and P. Tetali, Mathematical Aspects of Mixing Times in Markov Chains (2006); http://people.math.gatech.edu/~tetali/PUBLIS/survey.pdf.

    Google Scholar 

  38. S. Boyd and L. Vandenberghe, Convex Optimization (Cambridge Univ. Press, Cambridge, 2004).

    Book  MATH  Google Scholar 

  39. S. P. Meyn and R. L. Tweedie, Markov Chains and Stochastic Stability (Springer, London, 2005); http://probability.ca/MT/.

    Google Scholar 

  40. D. Paulin, “Concentration inequalities for Markov chains by Marton couplings,” e-print, arXiv:1212.2015v2, 2013.

    Google Scholar 

  41. M. Ledoux, Concentration of Measure Phenomenon (Am. Math. Soc., Providence, RI, 2001).

    MATH  Google Scholar 

  42. A. V. Gasnikov and E. V. Gasnikova, “On entropy similar functionals arising in stochastic chemical kinetics at invariant measure concentration and as Lyapunov functions of quasi-mean dynamics,” Mat. Zametki 93(6), 816–824 (2013).

    Google Scholar 

  43. M. A. Krasnosel’skii, E. A. Lifshits, and A. V. Sobolev, Positive Linear Systems: Method of Positive Operators (Nauka, Moscow, 1985) [in Russian].

    MATH  Google Scholar 

  44. M. Ya. Kel’bert and Yu. M. Sukhov, Probability and Statistics with Examples and Problems (MTsNMO, Moscow, 2010), Vol. 2 [in Russian].

    Google Scholar 

  45. E. Yu. Klochkov, “Parametric estimation in Transportation Problems,” Bachelor’s Thesis (Moscow Inst. Phys. Eng., Dolgoprudnyi, 2013).

    Google Scholar 

  46. V. Spokoiny, “Parametric estimation: Finite sample theory,” Ann. Stat. 40(6), 2877–2909 (2012).

    Article  MATH  MathSciNet  Google Scholar 

  47. I. N. Sanov, “On the probability of large deviations of random variables,” Mat. Sb. 42(1), 11–44 (1957).

    MathSciNet  Google Scholar 

  48. S. Boucheron, G. Lugosi, and P. Massart, Concentration Inequalities: A Nonasymptotic Theory of Independence (Oxford Univ. Press, London, 2013).

    Book  Google Scholar 

  49. A. Romashchenko, Expanders: Construction and Applications (Computer Science Club, Mosk. Gos, Univ., Moscow, 2009); http://www.mccme.ru/~anromash/courses/expanders2009.pdf.

    Google Scholar 

  50. A. S. Nemirovski and D. B. Yudin, Complexity of Problems and Efficiency of Optimization Methods (Nauka, Moscow, 1979) [in Russian].

    Google Scholar 

  51. D. Paulin, “Nonasymptotic confidence interval for MCMC,” e-print, arXiv:1212.2016v4, 2013.

    Google Scholar 

  52. A. V. Gasnikov, Yu. E. Nesterov, and V. G. Spokoiny, “On the efficiency of a randomization method for mirror descent in online optimization problems,” Comput. Math. Math. Phys. (2015) [in press].

    Google Scholar 

  53. Y. Mansour, Algorithmic Game Theory and Machine Learning (2011); www.tau.ac.il/~mansour/advancedagt+ml/.

    Google Scholar 

  54. S. Bubeck, Introduction to Online Optimization: Lecture Notes (Princeton University, New York, 2011); http://www.princeton.edu/~sbubeck/BubeckLectureNotes.pdf.

    Google Scholar 

  55. Y. Nesterov, “Smooth minimization of nonsmooth function,” Math. Program. Ser. A 103(1), 127–152 (2005).

    Article  MATH  MathSciNet  Google Scholar 

  56. Yu. E. Nesterov, Introduction to Convex Optimization (Mosk. Tsentr Neprer. Mat. Obrazovan., Moscow, 2010) [in Russian].

    Google Scholar 

  57. http://www2.isye.gatech.edu/~nemirovs/.

  58. Y. Nesterov, “Primal-dual subgradient methods for convex problems,” Math. Program. Ser. B 120, 221–259 (2009).

    Article  MATH  MathSciNet  Google Scholar 

  59. G. G. Magaril-Il’yaev and V. M. Tikhomirov, Convex Analysis: Theory and Applications (Am. Math. Soc., Providence, 2003; URSS, Moscow, 2011).

    Google Scholar 

  60. V. V. V’yugin, Mathematical Foundations of Machine Learning Theory and Prediction (Mosk. Tsentr Neprer. Mat. Obrazovan., Moscow, 2013); http://www.iitp.ru/upload/publications/6256/vyugin1.pdf.

    Google Scholar 

  61. G. Lugosi and N. Cesa-Bianchi, Prediction, Learning, and Games (Cambridge Univ. Press, New York, 2006).

    MATH  Google Scholar 

  62. A. B. Juditsky, A. V. Nazin, A. B. Tsybakov, and N. Vayatis, “Recursive aggregation of estimators by the mirror descent algorithm with averaging,” Probl. Inf. Transmission 41(4), 368–384 (2005).

    Article  MATH  Google Scholar 

  63. A. Juditsky, A. V. Nazin, A. B. Tsybakov, and N. Vayatis, “Gap-free bounds for stochastic multi-armed bandit,” Proceedings of the 17th IFAC World Congress, Seoul, Korea, July 6–11 2008 (IFAC, 2008), pp. 11560–11563.

    Google Scholar 

  64. S. P. Andersen, A. de Palma, and J.-F. Thisse, Discrete Choice Theory of Product Differentiation (MIT Press, Cambridge, 1992).

    Google Scholar 

  65. A. Juditsky and B. Polyak, “Robust eigenvector of stochastic matrix with application to PageRang,” Proceedings of the 51st IEEE Conference on Decision and Control, December 10–13, 2012, Maui, Hawaii, USA (IEEE, 2012), pp. 3171–3176 (arXiv:1206.4897).

    Chapter  Google Scholar 

  66. A. Tremba and A. Nazin, “Extension of a saddle point mirror descent algorithm with application to robust PageRank,” Proceedings of the 52nd IEEE Conference on Decision and Control, December 10–13, 2013, Florence, Italy (IEEE, 2013), pp. 3691–3696.

    Google Scholar 

  67. R. Tempo, http://staff.polito.it/roberto.tempo/.

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to A. V. Gasnikov.

Additional information

Original Russian Text © A.V. Gasnikov, D.Yu. Dmitriev, 2015, published in Zhurnal Vychislitel’noi Matematiki i Matematicheskoi Fiziki, 2015, Vol. 55, No. 3, pp. 355–371.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Gasnikov, A.V., Dmitriev, D.Y. On efficient randomized algorithms for finding the PageRank vector. Comput. Math. and Math. Phys. 55, 349–365 (2015). https://doi.org/10.1134/S0965542515030069

Download citation

  • Received:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1134/S0965542515030069

Keywords

Navigation