Automatic hardware synthesis of nested loops using UET grids and VHDL
This paper considers the automatic synthesis of systolic architectures from nested-loop algorithmic specifications. The high-level input is given as uniform dependence loops with unit dependencies, and the target architecture is a multidimensional systolic array with an unbounded number of cells. We present a complete methodology, based on VHDL specifications, for the hardware synthesis of the resulting architecture. This methodology automatically detects all necessary computation and communication elements and produces optimal layouts. The theoretical framework of our method is based on the properties of generalized UET grids. First, we calculate the optimal makespan for generalized UET grids, and then we establish the minimum number of systolic cells required to achieve that makespan. The complexity of the proposed scheduling algorithm is independent of the size of the nested loop and depends only on its dimension, which makes it the most efficient (in terms of complexity) known to us. All these methods were implemented and incorporated in an integrated software package that provides the designer with a powerful parallel design environment, from high-level algorithmic specifications to low-level (i.e., actual layout) optimal implementations.
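To make the notions of makespan and cell count concrete, the following is a minimal sketch for the simplest case only: a grid whose index points range over 0..u_i in each dimension, where each point depends on its immediate predecessor along every axis, every task takes one time unit, and communication is free. The function names and this restricted dependence set are assumptions made for illustration; this is not the paper's algorithm for generalized UET grids. Under these assumptions the earliest start time of a point is the sum of its coordinates, so the makespan has a closed form that depends only on the bounds, one term per dimension.

```python
from itertools import product


def earliest_times(upper_bounds):
    """Earliest start time of every point in a simple UET grid.

    Assumption: points range over 0..u_i per dimension, and point j
    depends on j - e_i for every axis unit vector e_i (when it exists).
    """
    times = {}
    # product() yields points in lexicographic order, so every
    # predecessor of a point has been visited before the point itself.
    for point in product(*(range(u + 1) for u in upper_bounds)):
        preds = [
            point[:i] + (point[i] - 1,) + point[i + 1:]
            for i in range(len(point))
            if point[i] > 0
        ]
        times[point] = 1 + max((times[p] for p in preds), default=-1)
    return times


def makespan(upper_bounds):
    """Makespan obtained by scanning the whole grid (cost grows with grid size)."""
    return max(earliest_times(upper_bounds).values()) + 1


def closed_form_makespan(upper_bounds):
    """Same makespan from the bounds alone: one addition per dimension."""
    return sum(upper_bounds) + 1


def peak_width(upper_bounds):
    """Largest number of points sharing one start time.

    In this simple grid every point lies on a longest path, so its start
    time is forced and this count is the number of cells needed to reach
    the makespan above.
    """
    counts = {}
    for t in earliest_times(upper_bounds).values():
        counts[t] = counts.get(t, 0) + 1
    return max(counts.values())
```

For example, `closed_form_makespan((3, 2))` and `makespan((3, 2))` agree, but only the former avoids visiting every index point, which mirrors the abstract's remark that the scheduling cost depends on the dimension rather than the loop size.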
Index terms: UET grid, index space, optimal makespan, optimal mapping, number of systolic cells, uniform unit dependence vectors, VHDL-based design automation