Abstract
The Biconjugate Gradient (BiCG) and Quasi-Minimal Residual (QMR) methods are among the popular iterative methods for solving large, sparse, non-symmetric systems of linear equations. When these methods are implemented on large-scale parallel computers, their scalability is limited by the global synchronization that inner product-like operations require. We therefore propose two new synchronization-reducing variants of BiCG and QMR to mitigate this performance penalty. The idea behind these new s-step variants is to group several dot products for joint execution. Although the new algorithms still exhibit numerical instabilities, they are shown to keep the cost of inner product-like operations almost independent of the number of processes, thus improving scalability significantly.
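The grouping idea can be illustrated with a small, purely sequential simulation. The sketch below is not the s-step BiCG or QMR algorithm of the paper; it only shows why packing several independent dot products into one global reduction lowers the number of synchronization points. All names (`SimulatedComm`, `split`, `dots_separate`, `dots_grouped`) are hypothetical, and the "processes" are simulated by slicing Python lists rather than by a real message-passing library.

```python
class SimulatedComm:
    """Hypothetical stand-in for an all-reduce; it only counts synchronizations."""

    def __init__(self):
        self.reductions = 0

    def allreduce_sum(self, per_process_values):
        # per_process_values holds one list of partial sums per simulated process.
        self.reductions += 1
        return [sum(column) for column in zip(*per_process_values)]


def split(vec, nprocs):
    """Distribute a vector over nprocs simulated processes in contiguous blocks."""
    n = len(vec)
    bounds = [n * p // nprocs for p in range(nprocs + 1)]
    return [vec[bounds[p]:bounds[p + 1]] for p in range(nprocs)]


def local_dot(x, y):
    """Dot product of the locally owned pieces of two vectors."""
    return sum(a * b for a, b in zip(x, y))


def dots_separate(pairs, nprocs, comm):
    """Conventional approach: one global reduction per dot product."""
    results = []
    for x, y in pairs:
        partials = [[local_dot(xs, ys)]
                    for xs, ys in zip(split(x, nprocs), split(y, nprocs))]
        results.append(comm.allreduce_sum(partials)[0])
    return results


def dots_grouped(pairs, nprocs, comm):
    """Grouped approach: pack all local partial sums, then reduce once."""
    chunks = [(split(x, nprocs), split(y, nprocs)) for x, y in pairs]
    partials = [[local_dot(xs[p], ys[p]) for xs, ys in chunks]
                for p in range(nprocs)]
    return comm.allreduce_sum(partials)


# Three dot products over vectors distributed across three simulated processes.
u = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
v = [6.0, 5.0, 4.0, 3.0, 2.0, 1.0]
w = [1.0, 0.0, -1.0, 0.0, 1.0, 0.0]
pairs = [(u, v), (u, w), (v, w)]

comm_a, comm_b = SimulatedComm(), SimulatedComm()
separate = dots_separate(pairs, 3, comm_a)
grouped = dots_grouped(pairs, 3, comm_b)
```

Both routines return the same values, but the grouped version triggers a single reduction instead of three. Grouping in this way requires the dot products to be independent of one another, which is what the reorganized recurrences of s-step methods aim to provide.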
Copyright information
© 2013 Springer International Publishing Switzerland
About this paper
Cite this paper
Feuerriegel, S., Bücker, H.M. (2013). Synchronization-Reducing Variants of the Biconjugate Gradient and the Quasi-Minimal Residual Methods. In: Kołodziej, J., Di Martino, B., Talia, D., Xiong, K. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2013. Lecture Notes in Computer Science, vol 8285. Springer, Cham. https://doi.org/10.1007/978-3-319-03859-9_19
DOI: https://doi.org/10.1007/978-3-319-03859-9_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-03858-2
Online ISBN: 978-3-319-03859-9
eBook Packages: Computer Science (R0)