EXPLORER: Supporting run-time parallelization of DO-ACROSS loops on general networks of workstations
Performing run-time parallelization on general networks of workstations (NOWs), without special hardware or system-software support, is very difficult, especially for DOACROSS loops. Given the high communication overhead on NOWs, run-time parallelization yields hardly any performance gain, because it generates a large volume of messages for dependence detection, data accesses, and computation scheduling. In this paper, we introduce the EXPLORER system for run-time parallelization of DOACROSS and DOALL loops on general NOWs. EXPLORER hides the communication overhead on NOWs through multithreading, a facility supported on almost all workstations. A preliminary version of EXPLORER was implemented on a NOW consisting of eight DEC Alpha workstations connected through an Ethernet, using the Pthreads package to support multithreading. Experiments on synthetic loops showed speedups of up to 6.5 on DOACROSS loops and 7 on DOALL loops.
Keywords: DOACROSS loops, inspector/executor, multithreading, networks of workstations, run-time parallelization