Domain decomposition in distributed and shared memory environments

I: A uniform decomposition and performance analysis for the NCUBE and JPL Mark IIIfp hypercubes
  • Geoffrey C. Fox
Session 11: Parallel Processing IV
Part of the Lecture Notes in Computer Science book series (LNCS, volume 297)

Abstract

We describe how explicit domain decomposition can lead to implementations of large-scale scientific applications that run with near-optimal performance on concurrent supercomputers with a variety of architectures. In particular, we show how one can discuss from a uniform point of view two architectural characteristics: distributed memory, and hierarchical memory in which a large, relatively slow memory is buffered by a faster cache or local memory. We consider two hypercubes in particular: the commercial NCUBE and JPL's Mark IIIfp, which has hierarchical memory at each node. We remark on the application of these ideas to other architectures and concurrent computers, and we present a performance analysis in terms of basic parameters describing the hardware and the applications.
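As an illustration (not taken from the paper itself), grain-size analyses of domain-decomposed codes of the kind the abstract describes typically model performance with a few basic parameters: the grain size n (points per node), the problem dimension d, a calculation time per point t_calc, and a communication time per word t_comm. Since only the surface of each node's subdomain must be communicated, the overhead scales like the surface-to-volume ratio. The constant c and the function names below are assumptions of this sketch, not values from the paper:

```python
def efficiency(n, d, t_calc, t_comm, c=1.0):
    """Estimated parallel efficiency for grain size n in d dimensions.

    Communication overhead f_C ~ c * (t_comm / t_calc) / n**(1/d)
    follows from the surface-to-volume ratio of a node's subdomain;
    c is an O(1) constant absorbing stencil and geometry details
    (an assumption of this sketch, not a measured value).
    """
    f_c = c * (t_comm / t_calc) / n ** (1.0 / d)
    return 1.0 / (1.0 + f_c)

# Larger grains shrink the surface-to-volume ratio, so efficiency rises:
for n in (16, 256, 4096):
    print(n, efficiency(n, d=2, t_calc=1.0, t_comm=2.0))
```

Under this model, efficiency depends on the hardware only through the ratio t_comm/t_calc, which is one way to compare machines as different as the NCUBE and the Mark IIIfp within a uniform framework.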

Keywords

Shared Memory, Main Memory, Domain Decomposition, Global Memory, Memory Hierarchy

References

  1. C3P-394: "Caltech Supercomputer Initiative: A Commitment to Leadership and Excellence," A. H. Barr, R. W. Clayton, A. Kuppermann, L. G. Leal, A. Leonard, T. A. Prince, December 29, 1986.
  2. G. Fox, M. Johnson, G. Lyzenga, S. Otto, J. Salmon, D. Walker, "Solving Problems on Concurrent Processors," April 1986. To be published by Prentice Hall, 1987.
  3. C3P-391: "The Hypercube as a Supercomputer," G. C. Fox, January 7, 1987. Published by the International Supercomputing Institute, Inc., St. Petersburg, Florida, May 1987.
  4. C3P-409: "Concurrent Supercomputer Initiative at Caltech," G. C. Fox, January 31, 1987. Published by the International Supercomputing Institute, Inc., St. Petersburg, Florida, May 1987.
  5. "Portable Programs for Parallel Processors," J. Boyle, R. Butler, T. Disz, B. Glickfield, E. Lusk, R. Overbeek, J. Patterson, R. Stevens. Published by Holt, Rinehart, and Winston, Inc., N.Y., 1987.
  6. C3P-435: "The Concurrent Supercomputing Initiative at Caltech," G. Fox and co-authors.
  7. C3P-255: "Concurrent Computation and the Theory of Complex Systems," G. C. Fox, S. W. Otto, March 3, 1986. Published in proceedings of the 1985 Hypercube Conference at Knoxville, August 1985, edited by M. Heath and published by SIAM.
  8. C3P-214: "Monte Carlo Physics on a Concurrent Processor," G. C. Fox, S. W. Otto, E. A. Umland, November 6, 1985. Invited talk by G. Fox at the "Frontiers of Quantum Monte Carlo" Conference at Los Alamos, September 6, 1985; published in a special issue of Journal of Statistical Physics, Vol. 43, 1209, Plenum Press, 1986.
  9. C3P-161: "The Performance of the Caltech Hypercube in Scientific Calculations: A Preliminary Analysis," G. Fox, April 1985. Invited talk at Symposium in Austin, Texas, March 18–20, 1985; published in "Supercomputers-Algorithms, Architectures and Scientific Computation," edited by F. A. Matsen and T. Tajima, University of Texas Press, Austin, 1985.
  10. C3P-292: "A Preprocessor for Irregular Finite Element Problems," CALT-68-1405, J. W. Flower, S. W. Otto, M. C. Salama, June 1986.
  11. C3P-363: "Load Balancing by a Neural Network," CALT-68-1408, G. C. Fox, W. Furmanski, September 1986.
  12. C3P-327B: "A Graphical Approach to Load Balancing and Sparse Matrix Vector Multiplication on the Hypercube," G. C. Fox, December 5, 1986. To be published in proceedings of the IMA Workshop, Minnesota, November 1986.
  13. C3P-385: "A Review of Automatic Load Balancing and Decomposition Methods for the Hypercube," G. C. Fox, November 1986. To be published in proceedings of the IMA Workshop, Minnesota, November 1986.
  14. C3P-328: "The Implementation of a Dynamic Load Balancer," G. Fox, A. Kolawa, R. Williams, November 1986. Published in proceedings of the 1986 Knoxville Hypercube Conference, edited by M. Heath and published by SIAM as "Hypercube Multiprocessors 1987."
  15. C3P-427: "A MOOSE Status Report," J. Salmon, S. Callahan, J. Flower, A. Kolawa, May 6, 1987.
  16. C3P-390: "An Evaluation of Mark III and NCUBE Supercomputers," G. C. Fox, December 9, 1986.
  17. J. Barnes, P. Hut, "A Hierarchical O(N log N) Force-Calculation Algorithm," Nature 324, 446 (1986).
  18. C3P-206: "Matrix Algorithms on the Hypercube I: Matrix Multiplication," G. Fox, A. J. G. Hey, S. Otto, October 1985. Published in Parallel Computing 4, 17 (1987).
  19. C3P-97: "Square Matrix Decompositions: Symmetric, Local, Scattered," G. Fox, August 15, 1984, unpublished.
  20. C3P-99: "LU Decomposition for Banded Matrices," G. C. Fox, August 15, 1984, unpublished.
  21. C3P-347: "Gauss Jordan Matrix Inversion with Pivoting on the Hypercube," P. Hipes, A. Kuppermann, August 8, 1986.
  22. C3P-348: "A Banded Matrix LU Decomposition on the Hypercube," T. Aldcroft, A. Cisneros, G. Fox, W. Furmanski, D. Walker, paper in preparation.
  23. C3P-386: "Matrix," G. C. Fox and W. Furmanski, paper in preparation.
  24. G. A. Geist, M. T. Heath, "Matrix Factorization on a Hypercube Multiprocessor"; C. Moler, "Matrix Computation on Distributed Memory Multiprocessor." Both articles are contained in "Hypercube Multiprocessors, 1986," edited by M. T. Heath, SIAM, 1986.
  25. J. Dongarra, invited talk at the 1987 International Conference on Supercomputing, Athens, June 8–12, 1987.
  26. C3P-405: "Hypercube Communication for Neural Network Algorithms," G. C. Fox, W. Furmanski, paper in preparation.
  27. C3P-404: "Piriform (Olfactory) Cortex Model on the Hypercube," J. M. Bower, M. E. Nelson, M. A. Wilson, G. C. Fox, W. Furmanski, February 1987.
  28. D. Gannon, invited talk at the 1987 International Conference on Supercomputing, Athens, June 8–12, 1987.
  29. K. Kennedy, invited talk at the 1987 International Conference on Supercomputing, Athens, June 8–12, 1987.
  30. A. Sameh, "Numerical Algorithms on the Cedar System," Second SIAM Conference on Parallel Processing, Norfolk, Virginia, November 1985.
  31. W. Jalby, U. Meier, "Optimizing Matrix Operations on a Parallel Multiprocessor with a Hierarchical Memory System," CSRD-555, University of Illinois report, 1986.
  32. D. J. Kuck, E. S. Davidson, D. H. Lawrie, A. H. Sameh, "Parallel Supercomputing Today and the Cedar Approach," Science 231, 967 (1986).
  33. C3P-314: "Optimal Communication Algorithms on the Hypercube," G. C. Fox, W. Furmanski, July 8, 1986; "Communication Algorithms for Regular Convolutions on the Hypercube," G. C. Fox, W. Furmanski, September 1, 1986. Published in proceedings of the 1986 Knoxville Hypercube Conference, edited by M. Heath and published by SIAM as "Hypercube Multiprocessors, 1987."

Copyright information

© Springer-Verlag Berlin Heidelberg 1988

Authors and Affiliations

  • Geoffrey C. Fox
    Caltech Concurrent Computation Program, Mail Code 158-79, California Institute of Technology, Pasadena
