Automatic parallelization of the conjugate gradient algorithm
The conjugate gradient (CG) method is a popular Krylov space method for solving systems of linear equations of the form Ax = b, where A is a symmetric positive-definite matrix. This method can be applied regardless of whether A is dense or sparse. In this paper, we show how restructuring compiler technology can be applied to transform a sequential, dense matrix CG program into a parallel, sparse matrix CG program. On the IBM SP-2, the performance of our compiled code is comparable to that of handwritten code from the PETSc library at Argonne.
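As a rough illustration of why CG applies to both dense and sparse A, the sketch below writes the iteration against an abstract matrix-vector product and then supplies a dense and a compressed sparse row (CSR) version of that product. It is not the compiler's output or the PETSc implementation; the function names (`conjugate_gradient`, `dense_matvec`, `csr_matvec`) and the choice of CSR storage are assumptions made for this example only.

```python
from typing import Callable, List

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def dense_matvec(A: List[Vector]) -> Callable[[Vector], Vector]:
    """Return y = A x for a dense matrix stored as a list of rows."""
    def apply(x: Vector) -> Vector:
        return [dot(row, x) for row in A]
    return apply


def csr_matvec(values: Vector, col_idx: List[int],
               row_ptr: List[int]) -> Callable[[Vector], Vector]:
    """Return y = A x for a matrix in CSR form (assumed storage format)."""
    def apply(x: Vector) -> Vector:
        return [
            sum(values[k] * x[col_idx[k]]
                for k in range(row_ptr[i], row_ptr[i + 1]))
            for i in range(len(row_ptr) - 1)
        ]
    return apply


def conjugate_gradient(matvec: Callable[[Vector], Vector], b: Vector,
                       tol: float = 1e-10, max_iter: int = 1000) -> Vector:
    """Solve A x = b for symmetric positive-definite A, given only x -> A x."""
    x = [0.0] * len(b)
    r = list(b)            # residual b - A x, with x = 0 initially
    p = list(r)            # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Ap = matvec(p)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# The same 2x2 SPD system, stored densely and in CSR form.
A_dense = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x_dense = conjugate_gradient(dense_matvec(A_dense), b)
x_sparse = conjugate_gradient(
    csr_matvec([4.0, 1.0, 1.0, 3.0], [0, 1, 0, 1], [0, 2, 4]), b)
```

Only the matrix-vector product touches the storage format of A; the rest of the iteration consists of vector updates and inner products, which is why the dense and sparse solves share the same loop.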