
Fast and highly scalable parallel computations for fundamental matrix problems on distributed memory systems

The Journal of Supercomputing

Abstract

We present fast and highly scalable parallel computations for a number of important and fundamental matrix problems on distributed memory systems (DMS). These problems include matrix multiplication, matrix chain product, computing the powers, the inverse, the characteristic polynomial, the determinant, the rank, the Krylov matrix, and an LU- and a QR-factorization of a matrix, and solving linear systems of equations. Our highly scalable parallel computations for these problems are based on a highly scalable implementation of the fastest sequential matrix multiplication algorithm on DMS. We show that, compared with the best known parallel time complexities on parallel random access machines (PRAM), the most powerful but unrealistic shared memory model of parallel computing, our parallel matrix computations achieve the same speeds on distributed memory parallel computers (DMPC) and incur an extra polylog factor in the time complexities on DMS with hypercubic networks. Furthermore, our parallel matrix computations are fully scalable on DMPC and highly scalable over a wide range of system sizes on DMS with hypercubic networks. Such fast (in terms of parallel time complexity) and highly scalable (in terms of our definition of scalability) parallel matrix computations were rarely seen before on any distributed memory system.
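To illustrate why a single scalable matrix multiplication primitive can drive the other computations, consider matrix powers: A^k can be formed with O(log k) multiplications by repeated squaring, so a multiplication routine that runs in T parallel steps yields the k-th power in O(T log k) parallel steps. The following minimal Python/NumPy sketch is not taken from the paper; the function name `matrix_power` and the serial `@` products are illustrative stand-ins for an underlying (parallel) multiplication primitive, and the example only shows the reduction itself.

```python
import numpy as np

def matrix_power(a: np.ndarray, k: int) -> np.ndarray:
    """Compute a**k using O(log k) matrix multiplications (repeated squaring).

    Each `@` product below stands in for one invocation of a matrix
    multiplication primitive; on a distributed memory system it would be
    one round of a parallel multiplication algorithm.
    """
    n = a.shape[0]
    result = np.eye(n, dtype=a.dtype)   # accumulator, starts as the identity
    base = a.copy()
    while k > 0:
        if k & 1:                       # this bit of k contributes the current power of two
            result = result @ base
        base = base @ base              # square to the next power of two
        k >>= 1
    return result

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.standard_normal((64, 64))
    assert np.allclose(matrix_power(a, 5), np.linalg.matrix_power(a, 5))
    print("matrix_power(a, 5) matches np.linalg.matrix_power")
```

On a distributed memory system, each serial product in the loop would be replaced by one call to the scalable parallel multiplication algorithm, which is the sense in which problems such as matrix powers inherit the scalability of the multiplication primitive.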



Author information

Correspondence to Keqin Li.


Cite this article

Li, K. Fast and highly scalable parallel computations for fundamental matrix problems on distributed memory systems. J Supercomput 54, 271–297 (2010). https://doi.org/10.1007/s11227-009-0319-0
