Resource-Efficient Parallel CG Algorithms for Linear Systems Solving on Heterogeneous Platforms

  • Conference paper
Supercomputing (RuSCDays 2020)

Abstract

This article discusses the parallel solution of systems of linear algebraic equations on a heterogeneous platform containing a central processing unit (CPU) and graphics accelerators (GPUs). When the CPU and GPU are used together, the performance of parallel algorithms for the classical conjugate gradient schemes is significantly limited by synchronization points. The article investigates a pipelined version of the conjugate gradient method with one synchronization point, the possibility of asynchronous calculations, and load balancing between the CPU and GPU when solving large linear systems. Numerical experiments were carried out on test matrices and on computational nodes of differing performance within a heterogeneous platform, which allowed us to estimate the contribution of communication costs. The algorithms are implemented using a combination of MPI, OpenMP, and CUDA. In addition to reducing the execution time, the proposed algorithms make it possible to solve large linear systems for which the memory of a single GPU or a single computing node is insufficient. Furthermore, the block algorithm with pipelining decreases the total execution time by reducing the number of synchronization points and aggregating several messages into one.
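The abstract does not state the recurrences, so the following is only a minimal sketch of one well-known conjugate gradient variant with a single synchronization point per iteration, the Chronopoulos and Gear formulation. It may differ from the authors' pipelined and block algorithms. Plain serial Python stands in for the MPI, OpenMP, and CUDA implementation, and the names `matvec`, `fused_dots`, and `cg_single_sync` are hypothetical. The point it illustrates is that both inner products of an iteration are computed together, so a distributed version could combine them in one global reduction.

```python
import math


def matvec(A, v):
    """Dense matrix-vector product; A is a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def fused_dots(r, w):
    """Compute (r, r) and (w, r) in one pass.

    In a distributed setting these two partial sums would be sent in a
    single global reduction, which is the only synchronization point.
    """
    gamma = 0.0
    delta = 0.0
    for r_k, w_k in zip(r, w):
        gamma += r_k * r_k
        delta += w_k * r_k
    return gamma, delta


def cg_single_sync(A, b, tol=1e-10, max_iter=1000):
    """Chronopoulos-Gear CG for a symmetric positive definite A, x0 = 0."""
    n = len(b)
    x = [0.0] * n
    r = list(b)                      # r0 = b - A*x0 with x0 = 0
    w = matvec(A, r)                 # w0 = A*r0
    gamma, delta = fused_dots(r, w)
    if gamma == 0.0:                 # b = 0, so x = 0 is the solution
        return x, 0
    b_norm = math.sqrt(gamma)
    alpha = gamma / delta
    beta = 0.0
    p = [0.0] * n
    s = [0.0] * n                    # s tracks A*p without an extra matvec
    for it in range(1, max_iter + 1):
        p = [r_k + beta * p_k for r_k, p_k in zip(r, p)]
        s = [w_k + beta * s_k for w_k, s_k in zip(w, s)]
        x = [x_k + alpha * p_k for x_k, p_k in zip(x, p)]
        r = [r_k - alpha * s_k for r_k, s_k in zip(r, s)]
        w = matvec(A, r)
        gamma_new, delta = fused_dots(r, w)   # single reduction per iteration
        if math.sqrt(gamma_new) <= tol * b_norm:
            return x, it
        beta = gamma_new / gamma
        alpha = gamma_new / (delta - beta * gamma_new / alpha)
        gamma = gamma_new
    return x, max_iter
```

Calling `cg_single_sync(A, b)` on a small symmetric positive definite system returns the approximate solution and the iteration count. Classical CG needs two separate reductions per iteration, whereas here the scalars `gamma_new` and `delta` are available after one, at the cost of the extra vector recurrence for `s`.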

Supported by the Russian Foundation for Basic Research (RFBR), research project 17-01-00402. The work was carried out with the financial support of Udmurt State University within the “Scientific Potential” grant competition, project No. 2020-04-03.



Corresponding author

Correspondence to Nikita S. Nedozhogin.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Nedozhogin, N.S., Kopysov, S.P., Novikov, A.K. (2020). Resource-Efficient Parallel CG Algorithms for Linear Systems Solving on Heterogeneous Platforms. In: Voevodin, V., Sobolev, S. (eds) Supercomputing. RuSCDays 2020. Communications in Computer and Information Science, vol 1331. Springer, Cham. https://doi.org/10.1007/978-3-030-64616-5_11

  • DOI: https://doi.org/10.1007/978-3-030-64616-5_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-64615-8

  • Online ISBN: 978-3-030-64616-5

  • eBook Packages: Computer Science, Computer Science (R0)
