OREGAMI: Tools for mapping parallel computations to parallel architectures

  • Virginia M. Lo
  • Sanjay Rajopadhye
  • Samik Gupta
  • David Keldsen
  • Moataz A. Mohamed
  • Bill Nitzberg
  • Jan Arne Telle
  • Xiaoxiong Zhong
Abstract

The OREGAMI project involves the design, implementation, and testing of algorithms for mapping parallel computations to message-passing parallel architectures. OREGAMI addresses the mapping problem by exploiting regularity and by allowing the user to guide and evaluate mapping decisions made by OREGAMI's efficient combinatorial mapping algorithms. OREGAMI's approach to mapping is based on a new graph theoretic model of parallel computation called the Temporal Communication Graph. The OREGAMI software tools include three components: (1) LaRCS is a graph description language which allows the user to describe regularity in the communication topology as well as the temporal communication behavior (the pattern of message-passing over time). (2) MAPPER is our library of mapping algorithms which utilize information provided by LaRCS to perform contraction, embedding, and routing. (3) METRICS is an interactive graphics tool for display and analysis of mappings. This paper gives an overview of the OREGAMI project, the software tools, and OREGAMI's mapping algorithms.
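As a rough illustration of the three mapping steps named in the abstract (contraction, embedding, and routing), the sketch below maps a ring of tasks onto a small hypercube. It is not taken from OREGAMI: the function names, the block contraction, the Gray-code embedding, the dimension-order routing, and the two reported measures (dilation and maximum link load) are all assumptions chosen for this example, and it assumes the processor count is a power of two that divides the task count.

```python
from collections import Counter


def ring_task_graph(n_tasks):
    """Communication edges of a ring: task i sends to task (i + 1) mod n."""
    return [(i, (i + 1) % n_tasks) for i in range(n_tasks)]


def contract(n_tasks, n_procs):
    """Contraction (illustrative): group tasks into equal consecutive blocks."""
    block = n_tasks // n_procs
    return {task: task // block for task in range(n_tasks)}


def embed_gray(n_procs):
    """Embedding (illustrative): place cluster c on hypercube node gray(c)."""
    return {c: c ^ (c >> 1) for c in range(n_procs)}


def ecube_route(src, dst):
    """Routing (illustrative): fix differing address bits, lowest bit first."""
    path, cur, bit = [src], src, 0
    diff = src ^ dst
    while diff:
        if diff & 1:
            cur ^= 1 << bit
            path.append(cur)
        diff >>= 1
        bit += 1
    return path


def map_ring(n_tasks, n_procs):
    """Run the three steps and return (dilation, maximum link load)."""
    cluster = contract(n_tasks, n_procs)
    node = embed_gray(n_procs)
    link_load = Counter()
    dilation = 0
    for u, v in ring_task_graph(n_tasks):
        a, b = node[cluster[u]], node[cluster[v]]
        if a == b:
            continue  # both tasks share a processor: no network traffic
        path = ecube_route(a, b)
        dilation = max(dilation, len(path) - 1)
        for hop in zip(path, path[1:]):
            link_load[hop] += 1
    return dilation, max(link_load.values(), default=0)


if __name__ == "__main__":
    print(map_ring(8, 4))
```

Because the ring is regular, the Gray-code placement keeps every pair of communicating clusters on adjacent hypercube nodes, which is the kind of structure a regularity-aware mapper can exploit.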

Key Words

Mapping, routing, embedding, task assignment, regular parallel computations, parallel programming environments


Copyright information

© Plenum Publishing Corporation 1991

Authors and Affiliations

  • Virginia M. Lo
    • 1
  • Sanjay Rajopadhye
  • Samik Gupta
  • David Keldsen
  • Moataz A. Mohamed
  • Bill Nitzberg
  • Jan Arne Telle
  • Xiaoxiong Zhong
  1. Department of Computer and Information Science, University of Oregon, Eugene
