Abstract
We study a fast algorithm for large-scale PageRank computation. PageRank is the measure the Google search engine uses to model the importance of web pages. It is defined as an eigenvector of a particular stochastic matrix derived from the graph of web pages. The power method is the typical means of computing this eigenvector, but a Krylov subspace method converges faster. The Krylov subspace method can be regarded as a two-step algorithm: the first step predicts the eigenvector, and the second step corrects the prediction. More precisely, the power method is first iterated to compute approximate eigenvectors. Then the Krylov subspace spanned by these approximations is searched for a better approximate eigenvector, namely one that minimizes the residual. To obtain a better approximation efficiently, we consider using subspaces not only in the second step but also in the first. Specifically, a Krylov subspace is first used to compute an approximate eigenvector, which is then used to expand another subspace. This second, non-Krylov subspace is then searched for a better approximate eigenvector that minimizes the residual over the subspace. This paper describes a heuristic search algorithm that alternates these two steps and presents an efficient implementation. Experimental results with huge Google matrices illustrate the performance improvements of the algorithm.
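The predict-and-correct idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a damping factor of 0.85, a tiny hypothetical link graph, and the smallest possible search space, the two-dimensional Krylov subspace spanned by x and Gx. All function names are hypothetical. The correction picks the affine combination of x and Gx whose residual Gx - x has the smallest 2-norm.

```python
ALPHA = 0.85  # damping factor; an assumed value, not taken from the paper


def google_matvec(links, x, alpha=ALPHA):
    """Return G*x for the column-stochastic Google matrix of a link graph.

    links[j] lists the pages that page j links to. Pages without
    out-links (dangling pages) spread their weight uniformly.
    """
    n = len(x)
    y = [0.0] * n
    dangling = 0.0
    for j, targets in enumerate(links):
        if targets:
            share = x[j] / len(targets)
            for i in targets:
                y[i] += share
        else:
            dangling += x[j]
    teleport = (1.0 - alpha) * sum(x) / n
    return [alpha * (y[i] + dangling / n) + teleport for i in range(n)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def residual_norm(links, x):
    """2-norm of the eigenvector residual G*x - x."""
    r = [g - xi for g, xi in zip(google_matvec(links, x), x)]
    return dot(r, r) ** 0.5


def power_steps(links, x, steps):
    """Prediction step: plain power iteration."""
    for _ in range(steps):
        x = google_matvec(links, x)
    return x


def min_residual_correction(links, x):
    """Correction step: search (1 - t)*x + t*G*x for the minimal residual.

    The residual is linear in t, r(t) = r0 + t*(r1 - r0), so the
    minimizer of its 2-norm has the closed form used below.
    """
    gx = google_matvec(links, x)
    ggx = google_matvec(links, gx)
    r0 = [b - a for a, b in zip(x, gx)]    # residual of x
    r1 = [c - b for b, c in zip(gx, ggx)]  # residual of G*x
    d = [b - a for a, b in zip(r0, r1)]
    dd = dot(d, d)
    if dd == 0.0:
        return gx  # residual does not depend on t; fall back to G*x
    t = -dot(r0, d) / dd
    return [(1.0 - t) * a + t * b for a, b in zip(x, gx)]


if __name__ == "__main__":
    links = [[1, 2], [2], [0], []]  # hypothetical 4-page graph
    x = power_steps(links, [0.25] * 4, 5)
    print(residual_norm(links, google_matvec(links, x)))
    print(residual_norm(links, min_residual_correction(links, x)))
```

Because t = 1 reproduces one more power step, the corrected vector's residual is never larger than that of the next power iterate, up to rounding. The entries still sum to one, although they are not guaranteed to stay nonnegative when t falls outside [0, 1]. Larger subspaces, as considered in the paper, would replace the closed-form t with a small least-squares problem.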
Additional information
This work was supported by KAKENHI Grant No. 18K11343.
About this article
Cite this article
Miyata, T. A heuristic search algorithm based on subspaces for PageRank computation. J Supercomput 74, 3278–3294 (2018). https://doi.org/10.1007/s11227-018-2383-9
DOI: https://doi.org/10.1007/s11227-018-2383-9
Keywords
- PageRank
- Google matrix
- Power iteration
- Krylov subspace
- Residual minimization
- Parallel computing