A Model for Hardware Realization of Kernel Loops

  • Jirong Liao
  • Weng-Fai Wong
  • Tulika Mitra
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2778)


Hardware realization of kernel loops promises to accelerate overall application performance and is therefore an important part of the synthesis process. In this paper, we consider two important loop optimization techniques, loop unrolling and software pipelining, that can affect the performance and cost of the synthesized hardware. We propose a novel model that accounts for various characteristics of a loop, including dependencies, parallelism, and resource requirements, as well as certain high-level constraints of the implementation platform. Using this model, we are able to deduce the optimal unroll factor and technique for achieving the best performance given a fixed resource budget. The model was verified using a compiler-based FPGA synthesis framework on a number of kernel loops. We believe that our model is general and applicable to other synthesis frameworks, and that it will help reduce the time needed for design space exploration.
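
To make the idea of choosing an unroll factor and technique under a fixed resource budget concrete, the following is a minimal sketch of a brute-force search. It is not the model proposed in the paper, whose equations are not reproduced here. The cost formulas and every name (`cycles_per_iteration`, `best_configuration`, `depth`, `rec_delay`, `area_per_iter`, `pipe_overhead`) are illustrative assumptions: area is assumed to grow linearly with the unroll factor, and a loop-carried dependency is assumed to impose a fixed delay per original iteration.

```python
def cycles_per_iteration(unroll, depth, rec_delay, pipelined):
    """Toy estimate of cycles per original loop iteration (assumed formula)."""
    if pipelined:
        # A new unrolled body starts every `ii` cycles. The loop-carried
        # dependency is assumed to cost rec_delay cycles per original iteration.
        ii = max(1, unroll * rec_delay)
        return ii / unroll
    # Without pipelining, unrolled bodies run back to back: the first copy
    # takes `depth` cycles and each further copy waits on the dependency.
    return (depth + (unroll - 1) * rec_delay) / unroll


def best_configuration(budget, depth, rec_delay, area_per_iter, pipe_overhead):
    """Enumerate unroll factors and both techniques within the area budget."""
    candidates = []
    for pipelined in (False, True):
        # Hypothetical area model: each unrolled copy costs area_per_iter,
        # plus pipe_overhead for pipeline registers when pipelined.
        copy_area = area_per_iter + (pipe_overhead if pipelined else 0)
        if copy_area <= 0:
            raise ValueError("area per unrolled copy must be positive")
        unroll = 1
        while unroll * copy_area <= budget:
            cpi = cycles_per_iteration(unroll, depth, rec_delay, pipelined)
            candidates.append((cpi, unroll * copy_area, unroll, pipelined))
            unroll += 1
    if not candidates:
        return None  # even a single copy does not fit in the budget
    # Fewest cycles per iteration wins; ties go to the smaller area.
    cpi, area, unroll, pipelined = min(candidates)
    return {"unroll": unroll, "pipelined": pipelined,
            "cycles_per_iter": cpi, "area": area}


if __name__ == "__main__":
    print(best_configuration(budget=40, depth=4, rec_delay=0,
                             area_per_iter=10, pipe_overhead=5))
    print(best_configuration(budget=40, depth=4, rec_delay=2,
                             area_per_iter=10, pipe_overhead=5))
```

Under these assumed formulas, a loop without carried dependencies benefits from both unrolling and pipelining, while a loop with a carried dependency gains nothing from unrolling once it is pipelined, so the search settles on the smallest configuration that reaches the dependency bound. An analytical model such as the one the paper describes aims to reach this kind of decision without exhaustively enumerating the design space.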


Design Space Exploration, Schedule Length, Hardware Realization, Software Pipeline, Loop Unroll
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Jirong Liao (1)
  • Weng-Fai Wong (1)
  • Tulika Mitra (1)
  1. Department of Computer Science, School of Computing, National University of Singapore, Singapore
