Accelerating the Arnoldi-Type Algorithm for the PageRank Problem and the ProteinRank Problem

Journal of Scientific Computing

Abstract

PageRank is an algorithm that ranks every Web page based on the link graph of the Web, and it plays an important role in Google’s search engine. The core of the PageRank algorithm is the computation of the principal eigenvector of the Google matrix. PageRank problems with high damping factors now need to be solved, and these are time-consuming to compute. One possible approach for accelerating the computation is the Arnoldi-type algorithm. However, this algorithm may perform unsatisfactorily when the damping factor is high and the dimension of the Krylov subspace is low; even worse, it may stagnate in practice. In this paper, we propose two strategies to improve the efficiency of the Arnoldi-type algorithm. Theoretical analysis shows that the new algorithms can accelerate the original Arnoldi-type algorithm considerably and circumvent the drawback of stagnation. Numerical experiments illustrate that the accelerated Arnoldi-type algorithms usually outperform many state-of-the-art acceleration algorithms for PageRank. Applications of the new algorithms to protein function prediction are also discussed.
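For orientation, the following is a minimal sketch of a basic restarted Arnoldi-type iteration for the PageRank eigenvector, in the refined style usually attributed to Golub and Greif. It is not the accelerated algorithms proposed in this paper, whose details the abstract does not give. The function names `google_matrix` and `arnoldi_type_pagerank`, the uniform teleportation vector, the dense matrix storage, the random test graph, and the parameter values (`alpha=0.85`, `m=8`) are all assumptions made for illustration.

```python
import numpy as np


def google_matrix(adj, alpha):
    """Dense column-stochastic Google matrix (assumes uniform teleportation)."""
    n = adj.shape[0]
    out_deg = adj.sum(axis=1)  # adj[j, i] = 1 means page j links to page i
    # Column j spreads page j's rank over its out-links;
    # dangling pages (no out-links) jump uniformly to every page.
    P = np.where(out_deg > 0, adj.T / np.maximum(out_deg, 1.0), 1.0 / n)
    return alpha * P + (1.0 - alpha) / n


def arnoldi_type_pagerank(A, m=8, tol=1e-10, max_restarts=500):
    """Restarted Arnoldi-type iteration with the eigenvalue fixed at 1 (sketch)."""
    n = A.shape[0]
    q = np.full(n, 1.0 / np.sqrt(n))  # unit-norm starting vector
    for restart in range(1, max_restarts + 1):
        Q = np.zeros((n, m + 1))
        H = np.zeros((m + 1, m))
        Q[:, 0] = q
        k = m
        # m steps of the Arnoldi process with modified Gram-Schmidt.
        for j in range(m):
            w = A @ Q[:, j]
            for i in range(j + 1):
                H[i, j] = Q[:, i] @ w
                w = w - H[i, j] * Q[:, i]
            H[j + 1, j] = np.linalg.norm(w)
            if H[j + 1, j] < 1e-14:  # breakdown: invariant subspace found
                k = j + 1
                break
            Q[:, j + 1] = w / H[j + 1, j]
        # Since A Q_k = Q_{k+1} H, the unit vector q = Q_k v minimising
        # ||A q - q||_2 is given by the right singular vector of H - [I; 0]
        # that belongs to the smallest singular value.
        shifted = H[: k + 1, :k] - np.eye(k + 1, k)
        _, s, Vt = np.linalg.svd(shifted)
        q = Q[:, :k] @ Vt[-1]
        if s[-1] < tol:  # s[-1] equals the residual norm ||A q - q||_2
            break
    return q / q.sum(), restart  # rescale so the entries sum to 1


# Small hypothetical test graph: 30 pages, about 20% link density.
rng = np.random.default_rng(0)
n = 30
adj = (rng.random((n, n)) < 0.2).astype(float)
np.fill_diagonal(adj, 0.0)

A = google_matrix(adj, alpha=0.85)
x, restarts = arnoldi_type_pagerank(A)
print("restarts:", restarts, "residual (1-norm):", np.abs(A @ x - x).sum())
```

Raising `alpha` toward 1 or lowering `m` in this sketch is one way to observe the slow convergence that the abstract describes for high damping factors and low-dimensional Krylov subspaces. A practical implementation would use sparse storage and never form the Google matrix explicitly.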

Acknowledgments

Gang Wu is supported by the National Science Foundation of China, the Qing-Lan Project of Jiangsu Province, and the 333 Project of Jiangsu Province. The work of Ying Zhang is partially supported by the National Science Foundation of China under grant 10901132. Yimin Wei is supported by the National Natural Science Foundation of China under Grant 11271084, 973 Program Project (No. 2010CB327900), Shanghai Education Committee under Dawn Project 08SG01, and Shanghai Science and Technology Committee. We would like to express our sincere thanks to two anonymous reviewers for their invaluable comments and constructive suggestions, which clarified and improved several sections of this paper. We are also grateful to Dr. David Gleich and Professor Tim Davis for providing us with the data files of the Web matrices, and to Professor Chen Greif for providing us with the MATLAB codes of the inner-outer power iterations. Finally, Gang Wu would like to thank the School of Mathematics and Statistics of Jiangsu Normal University for the use of its facilities during the development of this project.

Author information

Corresponding author

Correspondence to Yimin Wei.

About this article

Cite this article

Wu, G., Zhang, Y. & Wei, Y. Accelerating the Arnoldi-Type Algorithm for the PageRank Problem and the ProteinRank Problem. J Sci Comput 57, 74–104 (2013). https://doi.org/10.1007/s10915-013-9696-x

  • DOI: https://doi.org/10.1007/s10915-013-9696-x
