Efficient decomposition and performance of parallel PDE, FFT, Monte Carlo simulations, simplex, and Sparse solvers

Cvetanovic, Zarka; Freedman, Edward G.; Nofsinger, Charles

doi:10.1007/BF00127844

Published: October 1991

Volume 5, pages 219–238, (1991)

The Journal of Supercomputing

Abstract

In this paper, we describe the decomposition of six algorithms: two partial differential equation (PDE) solvers (successive over-relaxation [SOR] and alternating direction implicit [ADI]), the fast Fourier transform (FFT), Monte Carlo simulations, Simplex linear programming, and Sparse solvers. The algorithms were selected not only because of their importance in scientific applications, but also because they represent a variety of computational (structured to irregular) and communication (low to high) requirements. We present the performance results of these algorithms on two shared-memory VAX/VMS™ multiprocessor prototypes: the VAX 6300 series with up to 8 processors and the M31 with up to 22 processors. We demonstrate that by efficient decomposition it is possible to achieve high performance for all algorithms on both prototypes. We describe the efficient decomposition techniques applied to optimize the performance of parallel algorithms. Also, we discuss the performance implications of the different cache designs on the two multiprocessors.
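
As an illustration of what decomposing one of these algorithms for a shared-memory machine can look like, the sketch below applies a red-black ordering to SOR on a small Laplace problem and assigns contiguous row strips of the grid to workers. It is not taken from the paper: the abstract does not say how the authors decomposed SOR, so the red-black ordering, the row-strip partitioning, and every name here (`make_grid`, `row_strips`, `relax_strip`, `sor_solve`) are assumptions made for illustration. Python threads are used only to show the structure; in standard CPython they will not yield the speedups measured on real multiprocessors.

```python
from concurrent.futures import ThreadPoolExecutor

def make_grid(n, top=1.0):
    """n x n grid, zero everywhere except a fixed value on the top boundary row."""
    grid = [[0.0] * n for _ in range(n)]
    grid[0] = [top] * n
    return grid

def row_strips(n, workers):
    """Split interior rows 1..n-2 into contiguous strips, at most one per worker."""
    rows = list(range(1, n - 1))
    size = -(-len(rows) // workers)  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]

def relax_strip(grid, rows, color, omega):
    """Relax the points of one color inside a strip; return the largest change."""
    n = len(grid)
    biggest = 0.0
    for i in rows:
        start = 1 + (i + 1 + color) % 2  # first interior j with (i + j) % 2 == color
        for j in range(start, n - 1, 2):
            avg = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                          + grid[i][j - 1] + grid[i][j + 1])
            delta = omega * (avg - grid[i][j])
            grid[i][j] += delta
            biggest = max(biggest, abs(delta))
    return biggest

def sor_solve(grid, workers=4, omega=1.5, tol=1e-8, max_sweeps=10_000):
    """Red-black SOR sweeps over the shared grid; returns the number of sweeps run."""
    strips = row_strips(len(grid), workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for sweep in range(1, max_sweeps + 1):
            change = 0.0
            for color in (0, 1):
                # Points of one color read only neighbors of the other color,
                # so the strips can be relaxed concurrently without conflicts.
                results = pool.map(
                    lambda rows: relax_strip(grid, rows, color, omega), strips)
                # Consuming the results waits for every strip: a barrier
                # between the two color phases.
                change = max(change, max(results))
            if change < tol:
                return sweep
    return max_sweeps

if __name__ == "__main__":
    grid = make_grid(9)
    sweeps = sor_solve(grid, workers=4)
    print(f"converged in {sweeps} sweeps, center value {grid[4][4]:.6f}")
```

Because points of the same color never read one another, the result in this sketch does not depend on how many strips the grid is cut into; only the barrier between the two color phases requires synchronization. That kind of low, regular communication is what the abstract calls the structured end of the spectrum.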

Author information

Authors and Affiliations

Zarka Cvetanovic: Digital Equipment Corporation, 60 Codman Hill Rd., Boxborough, MA 01719, USA
Edward G. Freedman: Digital Equipment Corporation, 77 Reed Road, Hudson, MA 01749-2895, USA
Charles Nofsinger: Tradenet, Inc., 101 Main St., Cambridge, MA 02142, USA

Additional information

At the time of writing, all three authors were with Digital Equipment Corporation, VMS Systems and Servers Group.

About this article

Cite this article

Cvetanovic, Z., Freedman, E.G. & Nofsinger, C. Efficient decomposition and performance of parallel PDE, FFT, Monte Carlo simulations, simplex, and Sparse solvers. J Supercomput 5, 219–238 (1991). https://doi.org/10.1007/BF00127844

Issue Date: October 1991
DOI: https://doi.org/10.1007/BF00127844
