Runtime support for multi-tier programming of block-structured applications on SMP clusters

  • Stephen J. Fink
  • Scott B. Baden
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1343)

Abstract

We present a small set of programming abstractions that simplify the efficient implementation of block-structured scientific calculations on SMP clusters. We have implemented these abstractions in KeLP 2.0, a C++ class library. KeLP 2.0 provides hierarchical SPMD control flow to manage two levels of parallelism and locality. Additionally, to mask high inter-node communication costs, KeLP 2.0 combines inspector/executor communication analysis with overlap of communication and computation. We illustrate how these programming abstractions hide the low-level details of thread management, scheduling, synchronization, and message passing, while allowing the programmer to express efficient algorithms with intuitive geometric primitives.

Keywords

Node Level · Ghost Cell · Class Library · Collective Operation · Thread Management


References

  1. P. R. Woodward, “Perspectives on supercomputing: Three decades of change,” IEEE Computer, vol. 29, pp. 99–111, October 1996.
  2. Message Passing Interface Forum, “MPI: A message-passing interface standard,” University of Tennessee, Knoxville, TN, June 1995.
  3. High Performance Fortran Forum, High Performance Fortran Language Specification, November 1994.
  4. W. W. Gropp and E. L. Lusk, “A taxonomy of programming models for symmetric multiprocessors and SMP clusters,” in Proceedings 1995: Programming Models for Massively Parallel Computers, pp. 2–7, October 1995.
  5. S. J. Fink, S. B. Baden, and S. R. Kohn, “Flexible communication mechanisms for dynamic structured applications,” in Proc. 3rd Int'l Workshop IRREGULAR '96, pp. 203–215, August 1996.
  6. S. R. Kohn, A Parallel Software Infrastructure for Dynamic Block-Irregular Scientific Calculations. PhD thesis, University of California at San Diego, 1995.
  7. G. Agrawal, A. Sussman, and J. Saltz, “An integrated runtime and compile-time approach for parallelizing structured and block structured applications,” IEEE Transactions on Parallel and Distributed Systems, vol. 6, July 1995.
  8. W. Gropp, E. Lusk, N. Doss, and A. Skjellum, “A high-performance, portable implementation of the MPI message passing interface standard,” tech. rep., Argonne National Laboratory, Argonne, IL, 1997. http://www.mcs.anl.gov/mpi/mpich/.
  9. R. van de Geijn and J. Watts, “SUMMA: Scalable universal matrix multiplication algorithm,” Concurrency: Practice and Experience, vol. 9, pp. 255–274, April 1997.
  10. S. J. Fink, “Hierarchical programming for block-structured scientific calculations.” In preparation.
  11. D. Bailey, T. Harris, W. Saphir, R. van der Wijngaart, A. Woo, and M. Yarrow, “The NAS parallel benchmarks 2.0,” Tech. Rep. NAS-95-020, NASA Ames Research Center, December 1995.
  12. R. C. Agarwal, F. G. Gustavson, and M. Zubair, “An efficient parallel algorithm for the 3-D FFT NAS parallel benchmark,” in Proc. of SHPCC '94, pp. 129–133, May 1994.
  13. L. Snyder, “Foundations of practical parallel programming languages,” in Portability and Performance of Parallel Processing (T. Hey and J. Ferrante, eds.), John Wiley and Sons, 1993.
  14. B. Alpern, L. Carter, and J. Ferrante, “Modeling parallel computers as memory hierarchies,” in Programming Models for Massively Parallel Computers (W. K. Giloi, S. Jahnichen, and B. D. Shriver, eds.), pp. 116–123, IEEE Computer Society Press, September 1993.
  15. R. Eigenmann, J. Hoeflinger, G. Jaxson, and D. Padua, “Cedar Fortran and its compiler,” in CONPAR 90–VAPP IV, Joint International Conference on Vector and Parallel Processing, pp. 288–299, 1990.
  16. A. C. Sawdey, M. T. O'Keefe, and W. B. Jones, “A general programming model for developing scalable ocean circulation applications,” in Proceedings of the ECMWF Workshop on the Use of Parallel Processors in Meteorology, January 1997.
  17. D. A. Bader and J. JáJá, “SIMPLE: A methodology for programming high performance algorithms on clusters of symmetric multiprocessors.” Preliminary version, http://www.umiacs.umd.edu/research/EXPAR/papers/3798.html.

Copyright information

© Springer-Verlag 1997

Authors and Affiliations

  • Stephen J. Fink (1)
  • Scott B. Baden (1)

  1. Department of Computer Science and Engineering, University of California, San Diego, La Jolla
