International Journal of Parallel Programming, Volume 43, Issue 6, pp 1028–1053

PageRank Computation Using a Multiple Implicitly Restarted Arnoldi Method for Modeling Epidemic Spread

  • Zifan Liu
  • Nahid Emad
  • Soufian Ben Amor
  • Michel Lamure

Abstract

A parallel implementation of the multiple implicitly restarted Arnoldi method (MIRAM) is proposed for calculating the dominant eigenpair of stochastic matrices derived from very large real networks. The high damping factor of these matrices makes many existing algorithms less efficient, whereas MIRAM is a promising alternative. We also apply the method to epidemic modeling: this paper describes a stochastic model based on PageRank that simulates epidemic spread, in which a PageRank-like infection vector computed by MIRAM helps establish an efficient vaccination strategy. MIRAM is implemented within the Trilinos framework, targeting big data and sparse matrices that represent scale-free networks, also known as power-law networks. A hypergraph partitioning approach is employed to minimize communication overhead. The algorithm is tested on Grid'5000, a nationwide cluster of clusters. Experiments are conducted on very large networks, such as Twitter and Yahoo, with over 1 billion nodes. Our parallel implementation achieves a speedup of \(27\times\) over the sequential solver.
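
To make the "dominant eigenpair of a stochastic matrix" concrete, the following minimal sketch computes a PageRank-like vector on a toy network and ranks the nodes by score, as one might when prioritizing vaccination. It uses plain power iteration as a simple baseline, not MIRAM, and it is not the authors' implementation. The function name `pagerank_power`, the adjacency-list input format, the damping value 0.85, and the 4-node graph are all assumptions made for illustration.

```python
def pagerank_power(out_links, damping=0.85, tol=1e-12, max_iter=1000):
    """Approximate the dominant eigenvector of a Google-type matrix.

    out_links[i] lists the nodes that node i points to (hypothetical
    input format). Power iteration is a baseline here, not MIRAM.
    """
    n = len(out_links)
    x = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        # Probability mass on dangling nodes (no out-links) is spread uniformly.
        dangling = sum(x[i] for i in range(n) if not out_links[i])
        base = (1.0 - damping) / n + damping * dangling / n
        y = [base] * n
        for i, targets in enumerate(out_links):
            if targets:
                share = damping * x[i] / len(targets)
                for j in targets:
                    y[j] += share
        # L1 distance between successive iterates as a stopping criterion.
        err = sum(abs(a - b) for a, b in zip(x, y))
        x = y
        if err < tol:
            break
    return x


# Toy 4-node network (hypothetical data); node 3 has no out-links.
graph = [[1, 2], [2], [0], []]
scores = pagerank_power(graph)

# Nodes sorted from highest to lowest score, e.g. as a vaccination priority list.
ranking = sorted(range(len(graph)), key=lambda i: -scores[i])
```

The convergence rate of this baseline degrades as the damping factor approaches 1, which is the regime the abstract identifies as motivating a Krylov-type method such as MIRAM.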

Keywords

Epidemic · PageRank · Scale free networks · Power law · IRAM · Big data · Hypergraph partitioning

Notes

Acknowledgments

We would like to thank Fabrício Benevenuto from the Federal University of Ouro Preto for the \(twitter\) network, and Kim Capps from Yahoo! Labs for his help in getting access to the Alta Vista web network.


Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  • Zifan Liu (1, 2), email author
  • Nahid Emad (1, 2)
  • Soufian Ben Amor (2)
  • Michel Lamure (3)
  1. Maison de la Simulation, USR 3441, Gif-sur-Yvette, France
  2. PRiSM Laboratory, UMR 8144, University of Versailles, Versailles, France
  3. Santé, Individu, Société, EAM 4129, University of Lyon 1, University of Lyon, Lyon, France
