Randomized and deterministic simulations of PRAMs by parallel machines with restricted granularity of parallel memories
The present paper provides a comprehensive study of the following problem. Consider algorithms which are designed for shared memory models of parallel computation (PRAMs) in which processors are allowed to have fairly unrestricted access patterns to the shared memory. Consider also parallel machines in which the shared memory is organized in modules where only one cell of each module can be accessed at a time. Problem. Give general fast simulations of these algorithms by these parallel machines.
Each of our solutions answers two basic questions. (1) How should the logical memory addresses of the PRAM to be simulated be initially distributed among the physical locations of the simulating machine? (2) How is the physical location of a logical address computed during the simulation?
Randomization. The logical addresses are randomly distributed among the memory modules. This is done using universal hashing.
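As an illustration of the randomization idea, the following is a minimal sketch of a Carter-Wegman style universal hash that assigns logical addresses to memory modules. It is not taken from the paper: the prime `P`, the function name `make_module_hash`, and the module count of 8 are assumptions chosen for the example.

```python
import random

# Assumption: a prime larger than any logical address we intend to hash.
P = 2_147_483_647  # 2^31 - 1


def make_module_hash(num_modules, rng):
    """Draw one function h(x) = ((a*x + b) mod P) mod num_modules
    at random from a universal family (hypothetical helper)."""
    a = rng.randrange(1, P)  # a must be nonzero
    b = rng.randrange(0, P)

    def h(address):
        return ((a * address + b) % P) % num_modules

    return h


rng = random.Random(0)          # fixed seed so the sketch is repeatable
h = make_module_hash(8, rng)    # 8 memory modules (illustrative)
modules = [h(addr) for addr in range(16)]
```

Because `a` and `b` are drawn at random, no fixed access pattern can be bad for every choice of function; only the pair `(a, b)` has to be stored to recompute the module of any address.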
Copies. We keep copies of each logical address in several memory modules.
In a typical time cycle of the PRAM, a number of memory requests must be satisfied. As a primary objective, our simulations minimize the maximum number of memory requests assigned to the same module. Our solutions also optimize the following computational resources: they minimize the size of the physical memory, the time for computing the mapping from logical to physical addresses, and the space for storing this mapping.
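The quantity being minimized, the maximum number of requests that land on one module in a cycle, can be made concrete with a small sketch. The helper name `max_module_load` and the fixed modulo mapping used here are assumptions for illustration only; they are not the mapping proposed in the paper.

```python
from collections import Counter


def max_module_load(requests, module_of):
    """Return the largest number of requests mapped to a single module
    (hypothetical helper; module_of maps a logical address to a module index)."""
    loads = Counter(module_of(addr) for addr in requests)
    return max(loads.values()) if loads else 0


# Assumed naive mapping: address modulo the number of modules (4 here).
naive = lambda addr: addr % 4

# A strided access pattern sends every request to module 0.
strided = list(range(0, 32, 4))        # 8 requests
worst = max_module_load(strided, naive)  # 8

# Consecutive addresses spread evenly over the 4 modules.
consecutive = list(range(8))
best = max_module_load(consecutive, naive)  # 2
```

Since only one cell of a module can be accessed at a time, a cycle whose maximum load is `k` needs at least `k` module accesses in sequence, which is why a fixed mapping like the one above is vulnerable to unlucky access patterns.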
We discuss extensions of our solutions to various PRAMs and various shared memory parallel machines. Our solution is also applicable to synchronous distributed machines with no shared memory where the processors can communicate through a bounded degree network.
Keywords: Shared Memory, Parallel Machine, Access Pattern, Memory Module, Memory Address