OREGAMI: Tools for mapping parallel computations to parallel architectures
- 72 Downloads
- 20 Citations
Abstract
The OREGAMI project involves the design, implementation, and testing of algorithms for mapping parallel computations to message-passing parallel architectures. OREGAMI addresses the mapping problem by exploiting regularity and by allowing the user to guide and evaluate mapping decisions made by OREGAMI's efficient combinatorial mapping algorithms. OREGAMI's approach to mapping is based on a new graph theoretic model of parallel computation called the Temporal Communication Graph. The OREGAMI software tools include three components: (1) LaRCS is a graph description language which allows the user to describe regularity in the communication topology as well as the temporal communication behavior (the pattern of message-passing over time). (2) MAPPER is our library of mapping algorithms which utilize information provided by LaRCS to perform contraction, embedding, and routing. (3) METRICS is an interactive graphics tool for display and analysis of mappings. This paper gives an overview of the OREGAMI project, the software tools, and OREGAMI's mapping algorithms.
Key Words
Mapping routing embedding task assignement regular parallel computations parallel programming environmentsPreview
Unable to display preview. Download preview PDF.
References
- 1.M. Rosing, R. B. Schnabel, and R. P. Weaver, The Dino Parallel Programming Language, Technical Report CU-CS-457-90, Department of Computer Science, University of Colorado at Boulder (April 1990).Google Scholar
- 2.M. H. Coffin, Par: An Approach to Architecture-independent Parallel Programming, Technical Report TR90-28, Department of Computer Science, University of Arizona (August 1990).Google Scholar
- 3.H. S. Stone, Multiprocessor Scheduling with the Aid of Network Flow Algorithms,IEEE Trans. on Software Engineering SE-3(1):85–93 (January 1977).Google Scholar
- 4.V. M. Lo, Temporal Communication Graphs: Lamport's Process-time Graphs Augmented for the Purpose of Mapping and Scheduling, Technical Report CIS-TR-92-04, University of Oregon (1992) (To appearJ. Parallel and Distrib. Computing.)Google Scholar
- 5.L. Lamport, Time, Clocks, and the Ordering of Events in a Distributed System.Comm. of the ACM 21(7):558–565 (July 1978).Google Scholar
- 6.C. L. Seitz, The Cosmic Cube,Comm. of the ACM 28(1):22–33 (January 1985).Google Scholar
- 7.C. D. Polychronopoulos,Parallel Programming and Compilers, Kluwer Academic Publishers (1988).Google Scholar
- 8.S. H. Bokhari,Assignment Problems in Parallel and Distributed Computing, Kluwer Academic Publishers (1987).Google Scholar
- 9.F. Berman and B. Stramm, Prep-P: Evolution and Overview, Technical Report CS89-158, Department of Computer Science, University of California at San Diego (1989).Google Scholar
- 10.V. M. Lo: Heuristic Algorithms for Task Assignment in Distributed Systems,IEEE Trans. on Comput. 37(11):1384–1397 (1988).Google Scholar
- 11.P. Sadayappan, F. Ercal, and J. Ramanujam, Clustering Partitioning Approaches to Mapping Parallel Programs onto a Hypercube,Parallel Computing 13:1–16 (1990).Google Scholar
- 12.J. C. Browne, Framework for Formulation and Analysis of Parallel Computation Structures,Parallel Computing 3:1–9 (1986).Google Scholar
- 13.F. Berman, Experience with an Automatic Solution to the Mapping Problem,The Characteristics of Parallel Algorithms, The MIT Pres, pp. 307–334 (1987).Google Scholar
- 14.L Snyder, Introduction to the Configurable, Highly Parallel Computer,Computer 15(1):47–56 (January 1982).Google Scholar
- 15.A. Wagner, S. Chanson, N. Goldstein, J. Jiang, H. Larsen, and H. Sreekantaswamy, TIPS: Transputer-based Interactive Parallelizing System, Technical Report, Department of Computer Science, University of British Columbia (1990).Google Scholar
- 16.J. C. Browne, Code: A Unified Approach to Parallel Programming.IEEE Software 6(4):10–19 (July 1989).Google Scholar
- 17.H. El-Rewini and T. G. Lewis, Scheduling Parallel Program Tasks onto Arbitrary Target Machines.J. of Parallel and Distrib. Computing 9:138–153 (1990).Google Scholar
- 18.C. D. Polychronopoulos,Parallel Programming and Compilers, Kluwer Academic Publishers (1988).Google Scholar
- 19.V. Sakar, Partitioning and Scheduling Parallel Programs for Execution on Multiprocessors, Technical Report, Ph.D. Thesis, Department of Computer Science, Stanford University (1987).Google Scholar
- 20.A. L. Rosenberg, Graph Embeddings 1988: Recent Breakthroughs New Directions, Technical Report 88-28, University of Massachusetts at Amherst (March 1988).Google Scholar
- 21.F. Berman and L. Snyder, On Mapping Parallel Algorithms into Parallel Architectures,J. of Parallel and Distrib. Computing 4(5):439–458 (October 1987).Google Scholar
- 22.S. V. Rajopadhye and R. M. Fujimoto, Synthesizing Systolic Arrays from Recurrence Equations,Parallel Computing 14:163–189 (June 1990).Google Scholar
- 23.Marina C. Chen, A Design Methodology for Synthesizing Parallel Algorithms and Architectures,J. of Parallel and Distrib. Computing 3(6):461–491 (December 1986).Google Scholar
- 24.D. D. Kandlur and K. G. Shin, Traffic Routing for Multi-computer Networks with Virtual Cut-through Capability,Proc. of the 10th Int'l Conf. on Distrib. Comput. Syst., pp. 398–405 (May 1990).Google Scholar
- 25.B. P. Bianchini and J. P. Shen, Interprocessor Traffic Scheduling Algorithm for Multiprocessor Network,IEEE Trans. on Comput. C-36(4):396–409 (April 1987).Google Scholar
- 26.Simon M. Kaplan and Gail E. Kaiser, Garp: Graph Abstractions for Concurrent Programming, H. Ganzinger (ed.),European Symposium on Programming, Vol. 300 ofLecture Notes in Comput. Sci. Heidelberg, Springer-Verlag, pp. 191–205 (March 1988).Google Scholar
- 27.D. A. Bailey and J. E. Cuny, Graph Grammar Based Specification of Interconnection Structures for Massively Parallel Computation,Proc. of the Third Int'l Workshop on Graph Grammars, pp. 73–85 (1987).Google Scholar
- 28.J. Magee, J. Kramer, and M. Sloman, Constructing Distributed Systems in Conic,IEEE Trans. on Software Engineering SE15(6):663–675 (June 1989).Google Scholar
- 29.D. A. Bailey and J. E. Cuny,Visual Extensions to Parallel Programming Languages, MIT Press, pp. 17–36 (August 1989).Google Scholar
- 30.P. A. Nelson and L. Snyder, Programming Paradigms for Nonshared Memory Parallel Computers,The Characteristics of Parallel Algorithms, The MIT Press, pp. 3–20 (1987).Google Scholar
- 31.L. Snyder and D. Socha, Poker on the Cosmic Cube: the First Retargetable Parallel Programming Language and Environment,Proc. Int's Conf. on Parallel Proc., pp. 628–635 (August 1986).Google Scholar
- 32.L. Snyder,The XYZ Abstraction Levels of Poker-like Languages, MIT Press, pp. 470–489 (August 1989).Google Scholar
- 33.W. G. Griswold, G. A. Harrison, D. Notkin, and L. Snyder, Part Ensembles: A Communication Abstraction for Nonshared Memory Parallel Programming, Technical Report, Department of Computer Science, University of Washington (1989).Google Scholar
- 34.J. Vuillemin, A Data Structure for Manipulating Priority Queues,Commun. of the ACM 21(4):309–315 (April 1987).Google Scholar
- 35.V. M. Lo, S. Rajopadhy, S. Gupta, D. Keldsen, M. A. Mohamed, and J. Telle, Mapping Divide-and-conquer Algorithms to Parallel Architectures,Proc. IEEE Int'l Conf. on Parallel Proc., Vol. III, pp. 128–135 (August 1990). (Also available as University of Oregon Technical Report CIS-TR-89-19.)Google Scholar
- 36.X. X. Zhong, S. Rajopadhye, and V. M. Lo, Parallel Implementation of Divide-andconquer Algorithms on Biniary Debruijn Networks, Technical Report CIS-TR-91-21, University of Oregon (1991). (To appear inSixth Int'l Parallel Processing Symp.)Google Scholar
- 37.H. Wielandt,Finite Permutation Groups, Academic Press (1964).Google Scholar
- 38.M. Fellows, Problem Corner,Contemporary Mathematics 89: 187–188 (1989).Google Scholar
- 39.S. B. Akers and B. Krishnamurthy, A Group-theoretic Model for Symmetric Interconnection Network,IEEE Trans. on Comput. C-38(4):555–566 (April 1989).Google Scholar
- 40.V. M. Lo, Algorithms for Static Assignment and Symmetric Contraction in Distributed Computing Systems,Proc. IEEE Int'l Conf. on Parallel Proc., pp. 239–244 (August 1988).Google Scholar
- 41.X. X. Zhong and V. M. Lo, Application Specific Deadlock Free Wormhole Routing on Multicomputers, Technical Report CIS-TR-92-03, University of Oregon (1992). To appear inPARLE 92.Google Scholar
- 42.X. X. Zhong and V. M. Lo, An Efficient Heuristic for Applications Specific Routing on Mesh Connected Multiprocessors, Technical Report CIS-TR-92-04, University of Oregon (1992). (To appear in 1992Int'l Conf. on Parallel Processing.)Google Scholar
- 43.C. L. Seitz and W. J. Dally, Deadlock-free Message Routing in Multiprocessor Interconnection Networks,IEEE Trans. on Comput. 36(5):547–553 (May 1987).Google Scholar
- 44.Y. Han and R. Finkel, An Optimal Scheme for Disseminating Information,Proc. of the Int'l Conf. on Parallel Proc., pp. 198–203 (August 1988).Google Scholar
- 45.R. H. Campbell and A. N. Habermann,The Specification of Process Synchronization by Path Expressions, Springer-Verlag16:89–102 (1974).Google Scholar
- 46.S. L. Johnsson, Communication in Network Architectures.VLSI and Parallel Computation, Morgan Kaufmann Publishers, Inc., p. 290 (1990).Google Scholar
- 47.V. M. Lo, S. Rajopadhye, S. Gupta, D. Keldsen, M. A. Mohamed, and J. Telle, OREGAMI: Software Tools for Mapping Parallel Algorithms to Parallel Architectures,Proc. Int'l Conf. on Parallel Proc., Vol. II, pp. 88–92 (August 1990). Updated version available as University of Oregon Technical Report CIS-TR-89-18a.Google Scholar
- 48.V. M. Lo, S. Rajopadhye, M. A. Mohamed, S. Gupta, B. Nitzberg, J. A. Telle, and X. X. Zhong, LaRCS: A Language for Describing Parallel Computations for the Purpose of Mapping, Technical Report CIS-TR-90-16. University of Oregon Department of Computer Science (1990). (To appear inIEEE Trans. on Parallel and Distrib. Syst.)Google Scholar
- 49.R. Rowley and B. Bose, On Necklaces in Shuffle-exchange and DeBruijn Networks,Proc. Int'l Conf. on Parallel Proc., Vol. I, pp. 347–350 (August 1990).Google Scholar