The Journal of Supercomputing, Volume 4, Issue 4, pp 357–371

Using Strassen's algorithm to accelerate the solution of linear systems

  • David H. Bailey
  • King Lee
  • Horst D. Simon


Strassen's algorithm for fast matrix-matrix multiplication has been implemented for matrices of arbitrary shapes on the CRAY-2 and CRAY Y-MP supercomputers. Several techniques have been used to reduce the scratch space requirement for this algorithm while simultaneously preserving a high level of performance. When the resulting Strassen-based matrix multiply routine is combined with some routines from the new LAPACK library, LU decomposition can be performed with rates significantly higher than those achieved by conventional means. We succeeded in factoring a 2048 × 2048 matrix on the CRAY Y-MP at a rate equivalent to 325 MFLOPS.
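For readers unfamiliar with the technique named above, the following is a minimal pure-Python sketch of the textbook Strassen recursion, which replaces the eight block multiplications of a 2 × 2 block product with seven. It is not the authors' CRAY implementation: it assumes square matrices stored as lists of lists, makes no attempt to reduce scratch space, and falls back to the classical product for odd or small block sizes. The function names and the `cutoff` parameter are illustrative choices, not taken from the paper.

```python
def mat_add(a, b):
    """Elementwise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mat_sub(a, b):
    """Elementwise difference of two equally sized matrices."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mat_mul_classical(a, b):
    """Conventional O(n^3) matrix product, used as the base case."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b]
            for row in a]


def split(m):
    """Split an even-order square matrix into four quadrants."""
    h = len(m) // 2
    return ([r[:h] for r in m[:h]], [r[h:] for r in m[:h]],
            [r[:h] for r in m[h:]], [r[h:] for r in m[h:]])


def strassen(a, b, cutoff=2):
    """Multiply square matrices a and b with Strassen's seven-product scheme.

    Assumption: a and b are square and of the same order. Blocks of odd
    order, or of order <= cutoff, are multiplied conventionally.
    """
    n = len(a)
    if n <= cutoff or n % 2:
        return mat_mul_classical(a, b)

    a11, a12, a21, a22 = split(a)
    b11, b12, b21, b22 = split(b)

    # The seven recursive block products.
    m1 = strassen(mat_add(a11, a22), mat_add(b11, b22), cutoff)
    m2 = strassen(mat_add(a21, a22), b11, cutoff)
    m3 = strassen(a11, mat_sub(b12, b22), cutoff)
    m4 = strassen(a22, mat_sub(b21, b11), cutoff)
    m5 = strassen(mat_add(a11, a12), b22, cutoff)
    m6 = strassen(mat_sub(a21, a11), mat_add(b11, b12), cutoff)
    m7 = strassen(mat_sub(a12, a22), mat_add(b21, b22), cutoff)

    # Recombine the products into the four quadrants of the result.
    c11 = mat_add(mat_sub(mat_add(m1, m4), m5), m7)
    c12 = mat_add(m3, m5)
    c21 = mat_add(m2, m4)
    c22 = mat_add(mat_add(mat_sub(m1, m2), m3), m6)

    top = [left + right for left, right in zip(c11, c12)]
    bottom = [left + right for left, right in zip(c21, c22)]
    return top + bottom
```

Calling `strassen(a, b)` on two square integer matrices should return the same result as `mat_mul_classical(a, b)`. In practice the cutoff would be set much larger than 2, since the extra additions only pay off for large blocks.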

Key words

Strassen's algorithm, fast matrix multiplication, linear systems, LAPACK, vector computers

AMS Subject Classification: 65F05, 65F30, 68A20

CR Subject Classification: F.2.1, G.1.3, G.4





Copyright information

© Kluwer Academic Publishers 1990

Authors and Affiliations

  • David H. Bailey (1)
  • King Lee (2)
  • Horst D. Simon (3)

  1. NASA Ames Research Center, Moffett Field, USA
  2. Computer Science Department, California State University, Bakersfield, USA
  3. Computer Sciences Corporation, NASA Ames Research Center, Moffett Field, USA
