Abstract
In this paper, we focus on system-level optimizations for the automatic parallelization of nested loops on Reconfigurable Accelerators (RAs). Because off-chip bandwidth plays a major role in the overall performance of such implementations, we propose partitioning techniques based on loop tiling that take advantage of the hierarchically structured memory systems of RAs.
Supported in part by IFCPAR project 1802-1, CORCoP (Compilation and Optimization for Reconfigurable Co-Processors).
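As a rough illustration of why loop tiling matters when off-chip bandwidth is the bottleneck, the sketch below tiles a matrix-multiplication loop nest and counts the words copied into local buffers. It is not the partitioning scheme of the paper: the function name `matmul_tiled`, the counter `words_loaded`, and the cost model (every tile of the inputs is reloaded in full, results stay on chip) are assumptions made for this example only.

```python
def matmul_tiled(a, b, n, tile):
    """Multiply two n x n matrices (lists of lists) one tile at a time.

    Returns the product and the number of input words copied into local
    buffers, a stand-in (assumed model) for off-chip memory traffic.
    """
    c = [[0] * n for _ in range(n)]
    words_loaded = 0
    for ii in range(0, n, tile):
        for jj in range(0, n, tile):
            for kk in range(0, n, tile):
                i_end = min(ii + tile, n)
                j_end = min(jj + tile, n)
                k_end = min(kk + tile, n)
                # Copy one tile of a and one tile of b into local buffers.
                a_buf = [row[kk:k_end] for row in a[ii:i_end]]
                b_buf = [row[jj:j_end] for row in b[kk:k_end]]
                words_loaded += sum(len(r) for r in a_buf)
                words_loaded += sum(len(r) for r in b_buf)
                # Compute using the local buffers only.
                for i, a_row in enumerate(a_buf):
                    for j in range(j_end - jj):
                        acc = 0
                        for k, a_val in enumerate(a_row):
                            acc += a_val * b_buf[k][j]
                        c[ii + i][jj + j] += acc
    return c, words_loaded


if __name__ == "__main__":
    n = 4
    a = [[i * n + j for j in range(n)] for i in range(n)]
    identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for tile in (1, 2, 4):
        product, traffic = matmul_tiled(a, identity, n, tile)
        print(tile, traffic, product == a)
```

Under this simple model, and when the tile size divides n, the traffic is 2·n³/tile words, so larger tiles trade local buffer space for fewer off-chip transfers. A real accelerator adds constraints (buffer capacity, several memory levels) that this sketch does not model.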
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Derrien, S., Rajopadhye, S. (2001). Loop Tiling for Reconfigurable Accelerators. In: Brebner, G., Woods, R. (eds) Field-Programmable Logic and Applications. FPL 2001. Lecture Notes in Computer Science, vol 2147. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44687-7_41
DOI: https://doi.org/10.1007/3-540-44687-7_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42499-4
Online ISBN: 978-3-540-44687-3
eBook Packages: Springer Book Archive