Fast and highly scalable parallel computations for fundamental matrix problems on distributed memory systems

Li, Keqin

doi:10.1007/s11227-009-0319-0

Published: 29 July 2009

The Journal of Supercomputing, Volume 54, pages 271–297 (2010)

Abstract

We present fast and highly scalable parallel computations for a number of important and fundamental matrix problems on distributed memory systems (DMS). These problems include matrix multiplication, matrix chain product, and computing the powers, the inverse, the characteristic polynomial, the determinant, the rank, the Krylov matrix, and an LU- and a QR-factorization of a matrix, and solving linear systems of equations. Our highly scalable parallel computations for these problems are based on a highly scalable implementation of the fastest sequential matrix multiplication algorithm on DMS. We show that, compared with the best known parallel time complexities on parallel random access machines (PRAM), the most powerful but unrealistic shared memory model of parallel computing, our parallel matrix computations achieve the same speeds on distributed memory parallel computers (DMPC), and have an extra polylog factor in the time complexities on DMS with hypercubic networks. Furthermore, our parallel matrix computations are fully scalable on DMPC and highly scalable over a wide range of system sizes on DMS with hypercubic networks. Such fast (in terms of parallel time complexity) and highly scalable (in terms of our definition of scalability) parallel matrix computations have rarely been seen before on any distributed memory system.
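The abstract says that the computations for the other problems are built on a matrix multiplication implementation. The sketch below is one way to picture that kind of reduction. It is an illustration only, not the paper's DMS algorithm. It is sequential Python, the function names (`matmul`, `matrix_power`, `char_poly`, `determinant`) are hypothetical, and the plain cubic `matmul` is only a stand-in for a parallel multiplication primitive. Matrix powers use repeated squaring, and the characteristic polynomial uses the classical Faddeev-LeVerrier recurrence, which is an assumption here and may differ from the method used in the article.

```python
from fractions import Fraction


def matmul(a, b):
    """Plain cubic-time product; a stand-in for a parallel multiplication primitive."""
    inner = len(b)
    return [[sum(a[i][k] * b[k][j] for k in range(inner))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def identity(n):
    """n x n identity matrix with exact rational entries."""
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def matrix_power(a, k):
    """A^k by repeated squaring: O(log k) calls to matmul."""
    result = identity(len(a))
    base = [[Fraction(x) for x in row] for row in a]
    while k > 0:
        if k & 1:
            result = matmul(result, base)
        base = matmul(base, base)
        k >>= 1
    return result


def char_poly(a):
    """Coefficients [1, c_{n-1}, ..., c_0] of det(xI - A), via Faddeev-LeVerrier."""
    n = len(a)
    a = [[Fraction(x) for x in row] for row in a]
    m = identity(n)                      # M_1 = I
    coeffs = [Fraction(1)]               # leading coefficient c_n = 1
    for k in range(1, n + 1):
        am = matmul(a, m)                # A * M_k
        c = -sum(am[i][i] for i in range(n)) / k   # c_{n-k} = -tr(A M_k) / k
        coeffs.append(c)
        # M_{k+1} = A M_k + c_{n-k} I
        m = [[am[i][j] + (c if i == j else 0) for j in range(n)]
             for i in range(n)]
    return coeffs


def determinant(a):
    """det(A) = (-1)^n * c_0, read off the characteristic polynomial."""
    return (-1) ** len(a) * char_poly(a)[-1]
```

In this sketch every routine touches the input only through `matmul`, traces, and diagonal updates, so replacing `matmul` with a faster or parallel implementation would speed up all of them. Exact `Fraction` arithmetic is used only to keep the example free of rounding error.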

Author information

Authors and Affiliations

Department of Computer Science, State University of New York, New Paltz, New York, 12561, USA
Keqin Li

Corresponding author

Correspondence to Keqin Li.

About this article

Cite this article

Li, K. Fast and highly scalable parallel computations for fundamental matrix problems on distributed memory systems. J Supercomput 54, 271–297 (2010). https://doi.org/10.1007/s11227-009-0319-0

Received: 14 October 2008
Accepted: 13 July 2009
Published: 29 July 2009
Issue Date: December 2010
DOI: https://doi.org/10.1007/s11227-009-0319-0
