Staggered scheme: A loop allocation policy
The run-time overhead of detecting and allocating dynamic parallelism in a program can easily offset the performance gain. To improve performance and reduce this overhead, an allocation scheme would need to detect dynamic parallelism at compile time. However, accurately estimating run-time parallelism is difficult, and this remains a stumbling block in that direction. As a compromise, we propose an allocation policy which (i) detects dynamic parallelism for a selected group of program constructs at compile time and (ii) allocates them to the estimated hardware resources in a staggered fashion using a set of heuristic rules.