
Efficient Sparse LU Factorization with Left-Right Looking Strategy on Shared Memory Multiprocessors

BIT Numerical Mathematics

Abstract

An efficient sparse LU factorization algorithm for popular shared memory multiprocessors is presented. Pipelining parallelism is essential to achieving high parallel efficiency, and it is exploited with a left-right looking algorithm. No global barrier is used, and a completely asynchronous scheduling scheme is a central point of the implementation. The algorithm has been successfully tested on SUN Enterprise, DEC AlphaServer, SGI Origin 2000, and Cray T90 and J90 parallel computers, delivering up to 2.3 GFlop/s on an eight-processor DEC AlphaServer for medium-size semiconductor device simulations and structural engineering problems.
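The abstract gives no implementation details, so the following is only a minimal sketch of what barrier-free, asynchronous scheduling can look like in general, not the authors' algorithm. It assumes the work is organized as a forest of tasks (for example, an elimination tree) in which a node may start once all of its children have finished. The names `run_async`, `parent`, `work`, and `etree` are hypothetical and chosen for illustration only.

```python
import queue
import threading


def run_async(parent, work, num_threads=4):
    """Run work(v) for every node v of a forest, children before parents.

    parent[v] is the parent of node v, or None for a root (assumed to
    describe a valid forest). Threads never wait at a global barrier:
    each finished node decrements its parent's counter, and the parent
    becomes ready as soon as that counter reaches zero.
    """
    n = len(parent)
    if n == 0:
        return []

    pending = [0] * n                 # unfinished children per node
    for p in parent:
        if p is not None:
            pending[p] += 1

    ready = queue.Queue()
    for v in range(n):
        if pending[v] == 0:           # leaves can start immediately
            ready.put(v)

    lock = threading.Lock()
    order = []                        # completion order, for inspection

    def worker():
        while True:
            v = ready.get()
            if v is None:             # sentinel: no work left
                return
            work(v)
            with lock:
                order.append(v)
                finished_all = len(order) == n
                p = parent[v]
                if p is not None:
                    pending[p] -= 1
                    if pending[p] == 0:
                        ready.put(p)  # parent is released right away
            if finished_all:
                for _ in range(num_threads):
                    ready.put(None)   # wake every worker so it can exit

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return order


# Hypothetical 7-node tree: nodes 0,1 -> 2; nodes 3,4 -> 5; nodes 2,5 -> 6.
etree = [2, 2, 6, 5, 5, 6, None]
order = run_async(etree, work=lambda v: None)
```

In this sketch a subtree that finishes early lets its parent proceed without waiting for unrelated subtrees, which is the property a level-by-level barrier would give up. The completion order varies between runs; only the children-before-parent constraint is guaranteed.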




Cite this article

Schenk, O., Gärtner, K. & Fichtner, W. Efficient Sparse LU Factorization with Left-Right Looking Strategy on Shared Memory Multiprocessors. BIT Numerical Mathematics 40, 158–176 (2000). https://doi.org/10.1023/A:1022326604210


  • DOI: https://doi.org/10.1023/A:1022326604210
