Task-Queue Based Hybrid Parallelism: A Case Study

  • Karl Fürlinger
  • Olaf Schenk
  • Michael Hagemann
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3149)


In this paper we report on our experiences with hybrid parallelism in PARDISO, a high-performance sparse linear solver. We start with the OpenMP-parallel numerical factorization algorithm and reorganize it around a central dynamic task queue so that message-passing functionality can be added. The hybrid version allows the solver to run on a larger number of processors in a cost-effective way with very reasonable performance. A speed-up of more than nine is achieved on a four-node quad Itanium 2 SMP cluster, even though this first version of the implementation does not yet exploit a large potential for reducing MPI communication.
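
The idea of a central dynamic task queue can be illustrated with a small sketch. It is not PARDISO's implementation: the function name `run_task_queue`, the `deps` dictionary, and the use of Python threads in place of OpenMP threads are all assumptions made for illustration, and the numerical work on each block column is replaced by a placeholder. The dependency graph is assumed to be acyclic.

```python
import queue
import threading


def run_task_queue(deps, num_workers=4):
    """Process tasks in dependency order using one shared queue.

    deps maps each task id to the set of task ids it must wait for
    (hypothetical input format; assumed to contain no cycles).
    Returns the task ids in the order in which they completed.
    """
    if not deps:
        return []

    # Number of unfinished predecessors per task, and the reverse edges.
    remaining = {t: len(d) for t, d in deps.items()}
    dependents = {t: [] for t in deps}
    for t, d in deps.items():
        for p in d:
            dependents[p].append(t)

    # The central queue initially holds every task without predecessors.
    ready = queue.Queue()
    for t, n in remaining.items():
        if n == 0:
            ready.put(t)

    lock = threading.Lock()
    finished = []

    def worker():
        while True:
            task = ready.get()
            if task is None:  # sentinel: all tasks are done
                return
            # Placeholder: the work on one block column would go here.
            with lock:
                finished.append(task)
                # Release tasks whose last predecessor just completed.
                for child in dependents[task]:
                    remaining[child] -= 1
                    if remaining[child] == 0:
                        ready.put(child)
                # The worker that completes the last task stops everyone.
                if len(finished) == len(deps):
                    for _ in range(num_workers):
                        ready.put(None)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return finished
```

Because every worker pulls from the same queue, the assignment of tasks to workers is decided at run time. In a hybrid setting, one could imagine the same queue also handing ready tasks to processes on other nodes via message passing; that part is deliberately left out of this sketch.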


Keywords: Shared Memory; Message Passing; Factorization Algorithm; Block Column; Hybrid Version





Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Karl Fürlinger (1)
  • Olaf Schenk (2)
  • Michael Hagemann (2)

  1. Institut für Informatik, Lehrstuhl für Rechnertechnik und Rechnerorganisation, Technische Universität München
  2. Departement Informatik, Universität Basel
