Abstract
The bulk-synchronous parallel (BSP) model of computation is an emerging paradigm of general-purpose parallel computing. We study the BSP complexity of Gaussian elimination and related problems. First, we analyze Gaussian elimination without pivoting, which can be applied to the LU decomposition of symmetric positive-definite or diagonally dominant real matrices. Then we analyze Gaussian elimination with Schönhage's recursive local pivoting, which is suitable for the LU decomposition of matrices over a finite field and for the QR decomposition of real matrices by Givens rotations. Both versions of Gaussian elimination can be performed with an optimal amount of local computation, but optimal communication and synchronization costs cannot be achieved simultaneously. The algorithms presented in the paper allow one to trade off communication cost against synchronization cost within a certain range, achieving optimal or near-optimal cost values at the extremes. Bibliography: 19 titles.
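As a point of reference for the first problem named in the abstract, the following is a minimal sequential sketch of Gaussian elimination without pivoting, producing an LU decomposition. It is not the paper's BSP algorithm and does not model data distribution, communication, or synchronization. The function names `lu_no_pivot` and `matmul` and the sample matrix are assumptions chosen for illustration only.

```python
def lu_no_pivot(a):
    """Return (l, u) with l unit lower triangular, u upper triangular, l*u == a.

    No row exchanges are performed, so every pivot must be nonzero. This holds
    for symmetric positive-definite and strictly diagonally dominant matrices.
    """
    n = len(a)
    u = [row[:] for row in a]  # working copy, reduced to upper triangular form
    l = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n):
        pivot = u[k][k]
        if pivot == 0:
            raise ZeroDivisionError("zero pivot: elimination without pivoting fails")
        for i in range(k + 1, n):
            m = u[i][k] / pivot      # multiplier for row i
            l[i][k] = m
            u[i][k] = 0.0            # entry below the pivot is eliminated
            for j in range(k + 1, n):
                u[i][j] -= m * u[k][j]
    return l, u


def matmul(x, y):
    """Plain dense matrix product, used here only to check the factorization."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


# Hypothetical sample input: symmetric and strictly diagonally dominant.
a = [[4.0, 1.0, 2.0],
     [1.0, 5.0, 1.0],
     [2.0, 1.0, 6.0]]
l, u = lu_no_pivot(a)
product = matmul(l, u)
```

Here `product` should reproduce `a` up to rounding error. The three nested loops perform on the order of n^3 arithmetic operations, which is the local computation that a parallel version has to divide among processors.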
REFERENCES
A. Aggarwal, A. K. Chandra, and M. Snir, "Communication complexity of PRAMs," Theor. Comp. Sci., 71, 3-28 (1990).
Yu. D. Burago and V. A. Zalgaller, Geometric Inequalities (Grundlehren Math. Wiss., 285), Springer-Verlag (1988).
J. Choi et al., "The design and implementation of the ScaLAPACK LU, QR and Cholesky factorization routines," Technical Report ORNL/TM-12470 (1994).
D. Coppersmith and S. Winograd, "Matrix multiplication via arithmetic progressions," J. Symb. Comp., 9, 251-280 (1990).
D. Coppersmith and S. Winograd, "Matrix multiplication via arithmetic progressions," in: Computational Algebraic Complexity, E. Kaltofen (ed.), Academic Press (1990), pp. 23-52.
J. W. Demmel, N. J. Higham, and R. S. Schreiber, "Block LU factorization," Numer. Linear Algebra Appl., 2 (1995).
K. A. Gallivan, R. J. Plemmons, and A. H. Sameh, "Parallel algorithms for dense linear algebra computations," SIAM Review, 32, 54-135 (1990).
H. Hadwiger, Vorlesungen über Inhalt, Oberfläche und Isoperimetrie (Grundlehren Math. Wiss., 93), Springer-Verlag (1957).
J. E. Hopcroft and L. R. Kerr, "On minimizing the number of multiplications necessary for matrix multiplication," SIAM J. Appl. Math., 20, 30-36 (1971).
L. H. Loomis and H. Whitney, "An inequality related to the isoperimetric inequality," Bull. Am. Math. Soc., 55, 961-962 (1949).
W. F. McColl, "Scalable computing," Lect. Notes Comp. Sci., 1000, 46-61 (1995).
W. F. McColl, "A BSP realization of Strassen's algorithm," in: Abstract Machine Models for Parallel and Distributed Computing, M. Kara, J. R. Davy, D. Goodeve, and J. Nash (eds.), IOS Press (1996).
W. F. McColl, "Universal computing," Lect. Notes Comp. Sci., 1123, 25-36 (1996).
J. J. Modi, Parallel Algorithms and Matrix Computation, Clarendon Press (1988).
J. M. Ortega, Introduction to Parallel and Vector Solution of Linear Systems, Plenum Press (1988).
M. S. Paterson, Private Communication (1993).
L. G. Valiant, "Bulk-synchronous computers," in: Parallel Processing and Artificial Intelligence, M. Reeve (ed.), John Wiley & Sons (1989), pp. 15-22.
L. G. Valiant, "A bridging model for parallel computation," Commun. ACM, 33, 103-111 (1990).
L. G. Valiant, "General purpose parallel architectures," in: Handbook of Theoretical Computer Science, J. van Leeuwen (ed.) (1990), pp. 943-971.
Tiskin, A. Bulk-Synchronous Parallel Gaussian Elimination. Journal of Mathematical Sciences 108, 977–991 (2002). https://doi.org/10.1023/A:1013588221172