Fine-grain scheduling under resource constraints
Many present-day microprocessors have fine grain parallelism, be it in the form of a pipeline, of multiple functional units, or replicated processors. The efficient use of such architectures depends on the capability of the compiler to schedule the execution of the object code in such a way that most of the available hardware is in use while complying with data dependences. In the case of one simple loop, the schedule may be expressed as an affine form in the loop counter. The coefficient of the loop counter in the schedule is the initiation interval, and gives the mean rate at which loop bodies may be executed. The dependence constraints may be converted to linear inequalities in the coefficients of a closed form schedule, and then solved by classical linear programming algorithms. The resource constraints, however, translate to non-linear constraints. These constraints become linear if the initiation interval is known. This leads to a fast searching algorithm, in which the initiation interval is increased until a feasible solution is found.
Unable to display preview. Download preview PDF.
- [DGN92]Vincent H. Van Dongen, Guang R. Gao, and Qi Ning. A polynomial time method for optimal software pipelining. In Luc Bougé, Michel Cosnard, Yves Robert, and Denis Trystram, editors, Parallel Processing: CONPAR 92-VAPP V, pages 613–624, Springer, LNCS 634, June 1992.Google Scholar
- [Fea89]Paul Feautrier. Asymptotically efficent algorithms for parallel architectures. In M. Cosnard and C. Girault, editors, Decentralized System, pages 273–284, IFIP WG 10.3, North-Holland, December 1989.Google Scholar
- [Fea91]Paul Feautrier. Dataflow analysis of scalar and array references. Int. J. of Parallel Programming, 20(1):23–53, February 1991.Google Scholar
- [Fea92a]Paul Feautrier. Some efficient solutions to the affine scheduling problem, I, one dimensional time. Int. J. of Parallel Programming, 21(5):313–348, October 1992.Google Scholar
- [Fea92b]Paul Feautrier. Some efficient solutions to the affine scheduling problem, II, multidimensional time. Int. J. of Parallel Programming, 21(6):389–420, December 1992.Google Scholar
- [Fis84]J.A. Fisher. The VLIW machine: a multiprocessor for compiling scientific code. Computer, 45–53, July 1984.Google Scholar
- [GS92]Franco Gasperoni and Uwe Schwiegelshohn. Scheduling loops on parallel processors: a simple algorithm with close to optimum performance. In Luc Bougé, Michel Cosnard, Yves Robert, and Denis Trystram, editors, Parallel Processing: CONPAR 92-VAPP V, pages 625–636, Springer, LNCS 634, June 1992.Google Scholar
- [Lam88]Monica Lam. Software pipelining: an effective scheduling technique for VLIW machines. In Proc. of the SIGPLAN '88 Conf. on Programming Language Design and Implementation, pages 318–328, Atlanta, June 1988.Google Scholar
- [MQRS90]Christophe Mauras, Patrice Quinton, Sanjay Rajopadhye, and Yannick Saouter. Scheduling Affine Parameterized Recurrences by means of Variable Dependent Timing Functions. Technical Report 1204, INRIA, April 1990.Google Scholar
- [Qui87]Patrice Quinton. The systematic design of systolic arrays. In F. Fogelman, Y. Robert, and M. Tschuente, editors, Automata networks in Computer Science, pages 229–260, Manchester University Press, December 1987.Google Scholar
- [RF93]B. Ramakrishna Rau and Joseph A. Fisher. Instruction-level parallel processing: history, overview and perspective. The Journal of Supercomputing, 7:9–50, 1993.Google Scholar
- [RG81]B. R. Rau and C. D. Glaeser. Some scheduling techniques and an easily schedulable horizontal architecture for high-performance scientific computing. In IEEE/ACM 14th Annual Microprogramming Workshop, October 1981.Google Scholar