On Using an Hybrid MPI-Thread Programming for the Implementation of a Parallel Sparse Direct Solver on a Network of SMP Nodes

  • Pascal Hénon
  • Pierre Ramet
  • Jean Roman
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3911)


Over the last decade, most supercomputer architectures have been based on clusters of SMP nodes. In these architectures, processors exchange data through shared memory when they are located on the same SMP node and through the network otherwise. Generally, the MPI implementations provided by the vendor on these machines are adapted to this situation and take advantage of shared memory to handle messages between processors on the same SMP node. Nevertheless, this transparent way of exploiting shared memory does not avoid storing the extra structures needed to manage communications between processors efficiently. For high-performance parallel direct solvers, the storage of these extra structures can become a bottleneck. In this paper, we propose a hybrid MPI-thread implementation of a parallel direct solver and analyse the benefits of this approach in terms of memory and run-time performance.


Shared Memory, Dimensional Distribution, Direct Solver, Memory Overhead, Local Allocation
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Pascal Hénon (1)
  • Pierre Ramet (1)
  • Jean Roman (1)
  1. LaBRI, UMR CNRS 5800 & ENSEIRB, INRIA Futurs ScAlApplix Project, Talence, France
