Parallel sparse linear solver with GMRES method using minimization techniques of communications for GPU clusters

Ziane Khodja, Lilia; Couturier, Raphaël; Giersch, Arnaud; Bahi, Jacques M.

doi:10.1007/s11227-014-1143-8

Published: 07 March 2014

Volume 69, pages 200–224, (2014)

The Journal of Supercomputing

Lilia Ziane Khodja¹,
Raphaël Couturier¹,
Arnaud Giersch¹ &
…
Jacques M. Bahi¹

Abstract

In this paper, we aim to exploit the computing power of a graphics processing unit (GPU) cluster for solving large sparse linear systems. We implement a parallel version of the generalized minimal residual (GMRES) iterative method using the Compute Unified Device Architecture (CUDA) programming language and the MPI parallel environment. The experiments show that a GPU cluster is more efficient than a CPU cluster. To optimize performance, we use a compressed storage format for the sparse vectors together with hypergraph partitioning. These solutions improve the spatial and temporal locality of the data shared between the computing nodes of the GPU cluster.
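
The compressed storage idea for exchanged vectors can be illustrated with a small sketch. It is not taken from the paper: the function names (needed_columns, pack, renumber, spmv_csr), the CSR layout, and the toy two-node data are assumptions made for illustration, and plain Python stands in for what would be CUDA kernels and MPI messages in practice. The sketch shows a node requesting only the remote vector entries that its local rows reference, then renumbering its column indices so the sparse matrix-vector product reads from a short, contiguous vector.

```python
# Illustrative only: node 0 owns rows/columns 0..2 of a 6x6 system,
# node 1 owns 3..5. All names and data here are hypothetical.

def needed_columns(col_idx, lo, hi):
    """Sorted global column indices used by the local rows but owned elsewhere."""
    return sorted({c for c in col_idx if not (lo <= c < hi)})

def pack(x_owned, lo, requested):
    """Sender side: gather only the requested entries of the owned sub-vector."""
    return [x_owned[g - lo] for g in requested]

def renumber(col_idx, lo, hi, remote):
    """Map global columns to positions in [owned entries | packed remote entries]."""
    n_local = hi - lo
    slot = {g: n_local + k for k, g in enumerate(remote)}
    return [c - lo if lo <= c < hi else slot[c] for c in col_idx]

def spmv_csr(row_ptr, col_idx, values, x):
    """Sparse matrix-vector product for a matrix stored in CSR format."""
    y = []
    for r in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y

# Local rows of node 0 in CSR form, with global column indices.
row_ptr = [0, 3, 6, 9]
col_idx = [0, 1, 5, 0, 1, 2, 1, 2, 3]
values = [4.0, -1.0, -1.0, -1.0, 4.0, -1.0, -1.0, 4.0, -1.0]

x_node0 = [1.0, 2.0, 3.0]   # entries 0..2, owned by node 0
x_node1 = [4.0, 5.0, 6.0]   # entries 3..5, owned by node 1

remote = needed_columns(col_idx, 0, 3)      # what node 0 must receive
received = pack(x_node1, 3, remote)         # what node 1 would send
local_cols = renumber(col_idx, 0, 3, remote)
x_ext = x_node0 + received                  # 5 entries instead of 6
y = spmv_csr(row_ptr, local_cols, values, x_ext)
print(remote, received, y)
```

In this toy case node 1 sends two values rather than its whole three-entry sub-vector. The saving grows with the sparsity of the coupling between nodes, which is also what a good partitioning tries to reduce.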

Acknowledgments

This paper is based upon work supported by the Région de Franche-Comté and partially funded by the Labex ACTION program (contract ANR-11-LABX-01-01).

Author information

Authors and Affiliations

FEMTO-ST Institute, University of Franche-Comté, IUT Belfort-Montbéliard, 19 Av. du Maréchal Juin, BP 527, 90016 Belfort, France
Lilia Ziane Khodja, Raphaël Couturier, Arnaud Giersch & Jacques M. Bahi

Corresponding author

Correspondence to Raphaël Couturier.

About this article

Cite this article

Ziane Khodja, L., Couturier, R., Giersch, A. et al. Parallel sparse linear solver with GMRES method using minimization techniques of communications for GPU clusters. J Supercomput 69, 200–224 (2014). https://doi.org/10.1007/s11227-014-1143-8

Published: 07 March 2014
Issue Date: July 2014
DOI: https://doi.org/10.1007/s11227-014-1143-8
