Abstract
Hardware realization of kernel loops promises to accelerate overall application performance and is therefore an important part of the synthesis process. In this paper, we consider two important loop optimization techniques, loop unrolling and software pipelining, that can affect the performance and cost of the synthesized hardware. We propose a novel model that accounts for various characteristics of a loop, including dependencies, parallelism, and resource requirements, as well as certain high-level constraints of the implementation platform. Using this model, we are able to deduce the optimal unroll factor and technique for achieving the best performance given a fixed resource budget. The model was verified using a compiler-based FPGA synthesis framework on a number of kernel loops. We believe that our model is general and applicable to other synthesis frameworks, and that it will help reduce the time needed for design space exploration.
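To make the idea of choosing an unroll factor under a fixed resource budget concrete, the following is a minimal sketch of such a search. It is not the model proposed in the paper: the cost function, the assumption that unrolled iterations are independent and run fully in parallel, the linear area growth, and all names (`estimate`, `best_unroll`, `body_area`, `overhead_area`, `budget`) are hypothetical and chosen only for illustration.

```python
from math import ceil


def estimate(unroll, trip_count, body_latency, body_area, overhead_area):
    """Toy cost model (an assumption, not the paper's model).

    Assumes the unrolled copies of the loop body are independent and execute
    in parallel, and that area grows linearly with the unroll factor.
    """
    cycles = ceil(trip_count / unroll) * body_latency
    area = overhead_area + unroll * body_area
    return cycles, area


def best_unroll(trip_count, body_latency, body_area, overhead_area, budget):
    """Return (unroll, cycles, area) with the fewest cycles that fits the budget.

    Returns None when even the non-unrolled loop exceeds the budget.
    Ties in cycle count keep the smaller unroll factor, which uses less area.
    """
    best = None
    for u in range(1, trip_count + 1):
        cycles, area = estimate(u, trip_count, body_latency, body_area, overhead_area)
        if area > budget:
            break  # area only grows with u, so larger factors cannot fit either
        if best is None or cycles < best[1]:
            best = (u, cycles, area)
    return best
```

A real model would also have to account for loop-carried dependencies, memory bandwidth, and the alternative of software pipelining, all of which this sketch deliberately ignores.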
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Liao, J., Wong, WF., Mitra, T. (2003). A Model for Hardware Realization of Kernel Loops. In: Cheung, P.Y.K., Constantinides, G.A. (eds) Field Programmable Logic and Application. FPL 2003. Lecture Notes in Computer Science, vol 2778. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45234-8_33
DOI: https://doi.org/10.1007/978-3-540-45234-8_33
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40822-2
Online ISBN: 978-3-540-45234-8
eBook Packages: Springer Book Archive