The Journal of Supercomputing, Volume 10, Issue 1, pp 5–24

A quantitative study of parallel scientific applications with explicit communication

  • Robert Cypher
  • Alex Ho
  • Smaragda Konstantinidou
  • Paul Messina


This paper studies the behavior of scientific applications running on distributed memory parallel computers. Our goal is to quantify the floating point, memory, I/O, and communication requirements of highly parallel scientific applications that perform explicit communication. In addition to quantifying these requirements for fixed problem sizes and numbers of processors, we develop analytical models for the effects of changing the problem size and the degree of parallelism for several of the applications.

The contribution of our paper is that it provides quantitative data about real parallel scientific applications in a manner that is largely independent of the specific machine on which the application was run. Such data, which are clearly very valuable to an architect who is designing a new parallel computer, were not previously available. For example, the majority of research papers in interconnection networks have used simulated communication loads consisting of fixed-size messages. Our data, which show that using such simulated loads is unrealistic, can be used to generate more realistic communication loads.


Multicomputers, parallel applications, communication, scalability, I/O characteristics




Copyright information

© Kluwer Academic Publishers 1996

Authors and Affiliations

  • Robert Cypher (1)
  • Alex Ho (2)
  • Smaragda Konstantinidou (3)
  • Paul Messina (4)

  1. Computer Science Department, Johns Hopkins University, Baltimore, USA
  2. IBM Research Division, Almaden Research Center, San Jose, USA
  3. Computer Science Department, Johns Hopkins University, Baltimore, USA
  4. Caltech Concurrent Supercomputing Facilities, California Institute of Technology, Pasadena, USA
