Abstract
This paper presents a parallel iterative solver for large sparse linear systems implemented on a heterogeneous platform. Traditionally, such problems do not scale well on multi-CPU/multi-GPU clusters. We consider the standard preconditioned Conjugate Gradient (PCG) algorithm and, as an alternative, its pipelined variant, a formulation that is potentially better suited to hybrid CPU/GPU computing because it requires only one synchronization point per iteration instead of the two required by standard CG. On a heterogeneous cluster, each PCG iteration needs vector entries generated both by the current GPU and by other GPUs, so communication between GPUs becomes a major performance bottleneck. We study the implementation of pipelined PCG on a multi-CPU/multi-GPU platform and present an approach that reduces communication between cluster compute nodes for these solvers. In addition, computation and communication are overlapped to reduce the impact of data exchange. To achieve scalability, we combine the pipelined version of the conjugate gradient method, with its single synchronization point and its scope for asynchronous computation, with load balancing between the CPU and GPU when solving large linear systems in parallel. The algorithm is implemented using MPI, OpenMP, and CUDA together. We show that nearly optimal speedup can be reached on an 8-CPU/2-GPU configuration (relative to execution on one GPU). The parallel solver achieves a speedup of up to 5.49 on 16 NVIDIA Tesla GPUs compared with a single GPU.
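The single synchronization point mentioned in the abstract can be illustrated with a minimal serial sketch of a pipelined PCG recurrence in the form commonly attributed to Ghysels and Vanroose. Whether the chapter uses exactly this recurrence is an assumption, and the MPI/OpenMP/CUDA layers are not shown. All names here (`pipelined_pcg`, `laplacian_matvec`, `jacobi`) are hypothetical, and the 1D Laplacian with a Jacobi preconditioner is only a stand-in test problem. In a distributed setting, the three dot products would be fused into one global reduction, and the preconditioner application and matrix-vector product that follow could overlap with it.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def axpy(alpha, x, y):
    """Return alpha * x + y as a new list."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]

def pipelined_pcg(matvec, apply_prec, b, tol=1e-10, max_iter=1000):
    """Serial sketch of pipelined PCG; returns (solution, iterations)."""
    n = len(b)
    x = [0.0] * n
    r = list(b)               # r0 = b - A*x0 with x0 = 0
    u = apply_prec(r)         # u0 = M^{-1} r0
    w = matvec(u)             # w0 = A u0
    z = q = s = p = [0.0] * n
    gamma_old = alpha_old = 1.0
    b_norm = math.sqrt(dot(b, b))
    for it in range(max_iter):
        # The only synchronization point: in a parallel code these
        # dot products would form one fused global reduction.
        gamma = dot(r, u)
        delta = dot(w, u)
        rr = dot(r, r)
        # Work that does not depend on the reduction result, so it
        # could be overlapped with the reduction's communication.
        m = apply_prec(w)
        nv = matvec(m)
        if math.sqrt(rr) <= tol * b_norm:
            return x, it
        if it == 0:
            beta = 0.0
            alpha = gamma / delta
        else:
            beta = gamma / gamma_old
            alpha = gamma / (delta - beta * gamma / alpha_old)
        # Vector recurrences; no further reductions are needed.
        z = axpy(beta, z, nv)
        q = axpy(beta, q, m)
        s = axpy(beta, s, w)
        p = axpy(beta, p, u)
        x = axpy(alpha, p, x)
        r = axpy(-alpha, s, r)
        u = axpy(-alpha, q, u)
        w = axpy(-alpha, z, w)
        gamma_old, alpha_old = gamma, alpha
    return x, max_iter

def laplacian_matvec(v):
    """Apply the tridiagonal 1D Laplacian (2 on the diagonal, -1 off it)."""
    n = len(v)
    return [2.0 * v[i]
            - (v[i - 1] if i > 0 else 0.0)
            - (v[i + 1] if i < n - 1 else 0.0)
            for i in range(n)]

def jacobi(v):
    """Jacobi preconditioner for the Laplacian above: divide by the diagonal."""
    return [vi / 2.0 for vi in v]

# Right-hand side chosen so that the exact solution is a vector of ones.
rhs = laplacian_matvec([1.0] * 20)
solution, iterations = pipelined_pcg(laplacian_matvec, jacobi, rhs)
```

Standard PCG needs two separate reductions per iteration, one for each scalar coefficient. The pipelined form pays for its single reduction with extra vector recurrences (`z`, `q`, `s`, `w`), which is the trade-off that makes it attractive when global communication dominates.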
Acknowledgements
This work was carried out with the financial support of Udmurt State University within the “Scientific Potential” grant competition, project No. 2020-04-03.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Nedozhogin, N.S., Kopysov, S.P., Novikov, A.K. (2022). Scalability Pipelined Algorithm of the Conjugate Gradient Method on Heterogeneous Platforms. In: Badriev, I.B., Banderov, V., Lapin, S.A. (eds) Mesh Methods for Boundary-Value Problems and Applications. Lecture Notes in Computational Science and Engineering, vol 141. Springer, Cham. https://doi.org/10.1007/978-3-030-87809-2_27
DOI: https://doi.org/10.1007/978-3-030-87809-2_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87808-5
Online ISBN: 978-3-030-87809-2
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)