Impact of Asynchronism on GPU Accelerated Parallel Iterative Computations

  • Sylvain Contassot-Vivier
  • Thomas Jost
  • Stéphane Vialle
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7133)


We study the impact of asynchronism on parallel iterative algorithms in the particular context of local clusters of workstations that include GPUs. The test application is a classical 3D advection-diffusion-reaction PDE problem. We propose an asynchronous version of a previously developed PDE solver that uses GPUs for the inner computations. The algorithm is tested on two kinds of clusters: a homogeneous one and a heterogeneous one (with different CPUs and GPUs).
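To make the notion of asynchronism concrete, the following is a minimal sketch of a synchronous versus an asynchronous block-Jacobi iteration. It is not the authors' solver: the small tridiagonal system, the two-worker split, the fixed `delay`, and every function name are assumptions chosen for illustration, and the asynchronous run is simulated sequentially rather than executed on a cluster or a GPU.

```python
# Illustrative sketch only: a sequential simulation of asynchronous iterations.
# The system, the two "workers", and the delay model are hypothetical.

def jacobi_block_update(view, lo, hi, diag, off, b):
    """Jacobi update of components lo..hi-1 of a tridiagonal system."""
    n = len(view)
    new = []
    for i in range(lo, hi):
        left = view[i - 1] if i > 0 else 0.0
        right = view[i + 1] if i < n - 1 else 0.0
        new.append((b[i] - off * (left + right)) / diag)
    return new


def residual_norm(x, diag, off, b):
    """Max-norm of b - A x for the same tridiagonal matrix."""
    n = len(x)
    worst = 0.0
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        worst = max(worst, abs(b[i] - diag * x[i] - off * (left + right)))
    return worst


def solve(n=8, diag=4.0, off=-1.0, tol=1e-10,
          asynchronous=False, delay=2, max_steps=10_000):
    """Return (solution, steps). The matrix is strictly diagonally dominant."""
    b = [1.0] * n
    blocks = [(0, n // 2), (n // 2, n)]  # one block per simulated worker
    x = [0.0] * n
    history = [list(x)]  # past global states, used to model stale data
    for step in range(1, max_steps + 1):
        new_x = list(x)
        for worker, (lo, hi) in enumerate(blocks):
            if asynchronous and worker == 1 and step % 2 == 0:
                continue  # the slower worker skips every other step
            if asynchronous:
                # Remote data is `delay` steps old; the local block is current.
                view = list(history[max(0, len(history) - 1 - delay)])
                view[lo:hi] = x[lo:hi]
            else:
                view = x  # synchronous: everyone sees the latest iterate
            new_x[lo:hi] = jacobi_block_update(view, lo, hi, diag, off, b)
        x = new_x
        history.append(list(x))
        if residual_norm(x, diag, off, b) < tol:
            return x, step
    raise RuntimeError("no convergence within max_steps")


x_sync, steps_sync = solve(asynchronous=False)
x_async, steps_async = solve(asynchronous=True)
print("synchronous steps:", steps_sync)
print("asynchronous steps:", steps_async)
```

Both runs reach the same solution to within the tolerance, because the matrix is strictly diagonally dominant, a standard sufficient condition for the convergence of asynchronous iterations with bounded delays. On real hardware, the interest of the asynchronous scheme is that no worker sits idle waiting for the slowest one, usually at the price of extra iterations.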


Keywords: Parallelism, GPGPU, Asynchronism, Scientific computing





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Sylvain Contassot-Vivier (1, 2)
  • Thomas Jost (2)
  • Stéphane Vialle (2, 3)

  1. Loria, University Henri Poincaré, Nancy, France
  2. AlGorille INRIA Project Team, France
  3. SUPELEC - UMI 2598, France
