Fast Sparse Matrix-Vector Multiplication for TeraFlop/s Computers
Eigenvalue problems involving very large sparse matrices are common to various fields in science. In general, the numerical core of iterative eigenvalue algorithms is a matrix-vector multiplication (MVM) involving the large sparse matrix. We present three different programming approaches for parallel MVM on present-day supercomputers. In addition to a pure message-passing approach, two hybrid parallel implementations are introduced, based on the simultaneous use of message-passing and shared-memory programming models. For a modern SMP cluster (HITACHI SR8000), the performance and scalability of the hybrid implementations are discussed and compared with the pure message-passing approach on massively parallel systems (CRAY T3E), vector computers (NEC SX5e) and distributed shared-memory systems (SGI Origin3800).
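To make the MVM kernel concrete, the following is a minimal serial sketch in Python. It assumes the matrix is held in compressed row storage (CRS); the abstract does not name a storage format, so this choice, the function name `crs_matvec`, and the small 3x3 example are illustrative assumptions rather than the authors' implementation.

```python
def crs_matvec(values, col_idx, row_ptr, x):
    """Compute y = A * x for a sparse matrix A held in CRS form.

    values  : nonzero entries of A, stored row by row
    col_idx : column index of each entry in `values`
    row_ptr : row i occupies values[row_ptr[i]:row_ptr[i + 1]]
    x       : dense input vector
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Only the stored nonzeros of row i contribute to y[i].
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Hypothetical example matrix (not from the paper):
# [[4, 0, 1],
#  [0, 2, 0],
#  [3, 0, 5]]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
x = [1.0, 2.0, 3.0]

y = crs_matvec(values, col_idx, row_ptr, x)
print(y)
```

Each row's result depends only on that row's nonzeros and on the entries of `x` they reference, so the outer loop can be split across processes or threads. In a distributed setting, the entries of `x` owned by other processes would have to be communicated first.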