Optimizing a Conjugate Gradient Solver with Non-Blocking Collective Operations

Hoefler, Torsten; Gottschling, Peter; Rehm, Wolfgang; Lumsdaine, Andrew

doi:10.1007/11846802_52

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 4192)

Included in the following conference series:

European Parallel Virtual Machine / Message Passing Interface Users’ Group Meeting

Abstract

This paper presents a case study about the applicability and usage of non-blocking collective operations. These operations provide the ability to overlap communication with computation and to avoid unnecessary synchronization. We introduce our NBC library, a portable low-overhead implementation of non-blocking collectives on top of MPI-1. We demonstrate the easy usage of the NBC library with the optimization of a conjugate gradient solver with only minor changes to the traditional parallel implementation of the program. The optimized solver runs up to 34% faster and is able to overlap most of the communication. We show that there is, due to the overlap, no performance difference between Gigabit Ethernet and InfiniBand™ for our calculation.
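
The overlap pattern the abstract describes (start a collective, do independent work, wait only when the result is needed) can be sketched as follows. This is a minimal illustration under stated assumptions, not the NBC library's API and not the authors' solver: a thread-pool future stands in for a non-blocking collective, the "ranks" are simply chunks of one list, and the names `iallreduce_sum` and `overlapped_step` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def local_dot(x, y):
    """Dot product of the vector chunk owned by one (simulated) rank."""
    return sum(a * b for a, b in zip(x, y))


def iallreduce_sum(pool, partials):
    """Hypothetical stand-in for a non-blocking allreduce.

    Returns a future immediately; the sum is computed in the background.
    A real implementation would combine values across processes instead.
    """
    return pool.submit(sum, partials)


def overlapped_step(pool, r_chunks, x, p, alpha):
    """One simplified solver step with communication/computation overlap."""
    # 1. Start the "collective": reduce the per-rank partial values of r.r.
    partials = [local_dot(chunk, chunk) for chunk in r_chunks]
    request = iallreduce_sum(pool, partials)

    # 2. Independent work that does not depend on the reduced value:
    #    the vector update x <- x + alpha * p.
    x_new = [xi + alpha * pi for xi, pi in zip(x, p)]

    # 3. Wait only at the point where the reduced value is actually needed.
    rr = request.result()
    return rr, x_new


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=1) as pool:
        rr, x_new = overlapped_step(
            pool,
            r_chunks=[[1.0, 2.0], [3.0, 4.0]],  # residual split over two "ranks"
            x=[0.0, 0.0, 0.0, 0.0],
            p=[1.0, 1.0, 1.0, 1.0],
            alpha=0.5,
        )
        print(rr, x_new)
```

The key design point is that the wait in step 3 is deferred past every operation that does not need the reduced value; how much communication can actually be hidden depends on how much such independent work the algorithm offers.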

Author information

Authors and Affiliations

Open Systems Lab, Indiana University, Bloomington, IN, 47404, USA
Torsten Hoefler, Peter Gottschling & Andrew Lumsdaine
Department of Computer Science, Technical University of Chemnitz, 09107, Chemnitz, Germany
Torsten Hoefler & Wolfgang Rehm

Editor information

Editors and Affiliations

Forschungszentrum Jülich, ZAM, 52425, Jülich, Germany
Bernd Mohr
NEC Europe Ltd., NEC Laboratories Europe, Rathausallee 10, D-53757, Sankt Augustin, Germany
Jesper Larsson Träff
Dolphin Interconnect Solutions ASA R&D Germany, Siebengebirgsblick 26, 53343, Wachtberg, Germany
Joachim Worringen
Computer Science Department, University of Tennessee, 37996-3450, Knoxville, TN, USA
Jack Dongarra

Cite this paper

Hoefler, T., Gottschling, P., Rehm, W., Lumsdaine, A. (2006). Optimizing a Conjugate Gradient Solver with Non-Blocking Collective Operations. In: Mohr, B., Träff, J.L., Worringen, J., Dongarra, J. (eds) Recent Advances in Parallel Virtual Machine and Message Passing Interface. EuroPVM/MPI 2006. Lecture Notes in Computer Science, vol 4192. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11846802_52

DOI: https://doi.org/10.1007/11846802_52
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-39110-4
Online ISBN: 978-3-540-39112-8
eBook Packages: Computer Science, Computer Science (R0)
