Journal of Scientific Computing, Volume 68, Issue 2, pp 803–825

Analysis and Practical Use of Flexible BiCGStab

Abstract

A flexible version of the BiCGStab algorithm for solving a linear system of equations is analyzed. We show that under variable preconditioning, the perturbation to the outer residual norm is of the same order as the perturbation to the application of the preconditioner. Hence, to maintain convergence behavior similar to that of BiCGStab while reducing the preconditioning cost, the flexible version can be used with a moderate tolerance in the preconditioning Krylov solves. We explore the use of flexible BiCGStab in a large-scale reacting flow application, PFLOTRAN, and show that a variable multigrid preconditioner significantly reduces the simulation time on extreme-scale computers using \(O(10^4)\) to \(O(10^5)\) processor cores.
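The idea of variable preconditioning can be illustrated with a small sketch. The code below is not the authors' implementation: it assumes the common right-preconditioned form of BiCGStab, in which the preconditioned vectors are kept and reused in the solution update, so the preconditioner may change from one application to the next. The inner solver (Jacobi sweeps stopped at a loose relative tolerance), the function names, and the test matrix are all assumptions chosen for illustration.

```python
import numpy as np


def jacobi_inner_solve(A, rhs, tol, max_iter=200):
    """Approximate A^{-1} rhs with Jacobi sweeps, stopped at a loose tolerance.

    The number of sweeps depends on rhs, so this acts as a variable
    (nonlinear) preconditioner. Hypothetical helper for illustration.
    """
    d = np.diag(A)
    z = np.zeros_like(rhs)
    rhs_norm = np.linalg.norm(rhs)
    if rhs_norm == 0.0:
        return z
    for _ in range(max_iter):
        res = rhs - A @ z
        if np.linalg.norm(res) <= tol * rhs_norm:
            break
        z = z + res / d
    return z


def flexible_bicgstab(A, b, precond, tol=1e-8, max_iter=100):
    """Right-preconditioned BiCGStab with a possibly varying preconditioner.

    precond(y) returns an approximation to A^{-1} y and may differ per call.
    Returns the approximate solution and the number of outer iterations.
    """
    x = np.zeros_like(b)
    r = b - A @ x
    r_hat = r.copy()  # fixed shadow residual
    rho = alpha = omega = 1.0
    v = np.zeros_like(b)
    p = np.zeros_like(b)
    b_norm = np.linalg.norm(b)
    for k in range(1, max_iter + 1):
        rho_new = r_hat @ r
        beta = (rho_new / rho) * (alpha / omega)
        p = r + beta * (p - omega * v)
        p_tilde = precond(p)  # first inexact preconditioner application
        v = A @ p_tilde
        alpha = rho_new / (r_hat @ v)
        s = r - alpha * v
        if np.linalg.norm(s) <= tol * b_norm:
            # Converged at the half step; skip the stabilization step.
            return x + alpha * p_tilde, k
        s_tilde = precond(s)  # second inexact preconditioner application
        t = A @ s_tilde
        omega = (t @ s) / (t @ t)
        # The update uses the stored preconditioned vectors, so r stays
        # equal to b - A x (up to rounding) even when precond varies.
        x = x + alpha * p_tilde + omega * s_tilde
        r = s - omega * t
        rho = rho_new
        if np.linalg.norm(r) <= tol * b_norm:
            return x, k
    return x, max_iter


# Small nonsymmetric, diagonally dominant test problem (assumed example).
n = 50
A = 4.0 * np.eye(n) - np.eye(n, k=-1) - 2.0 * np.eye(n, k=1)
x_true = np.linspace(1.0, 2.0, n)
b = A @ x_true

# A moderate inner tolerance keeps each preconditioner application cheap.
x, iters = flexible_bicgstab(
    A, b, lambda y: jacobi_inner_solve(A, y, tol=1e-2)
)
print("outer iterations:", iters)
print("relative residual:", np.linalg.norm(b - A @ x) / np.linalg.norm(b))
```

In this sketch the inner tolerance `tol=1e-2` plays the role of the moderate preconditioning tolerance mentioned in the abstract; tightening it makes each preconditioner application more expensive.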

Keywords

Krylov method; BiCGStab; Variable preconditioning; Extreme-scale simulation

Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. IBM Thomas J. Watson Research Center, Yorktown Heights, USA
  2. Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, USA
