SpExSim: assessing kernel suitability for C-based high-level hardware synthesis

Abstract

We present SpExSim, a software tool for quickly surveying legacy code bases for kernels that could be accelerated by FPGA-based compute units. We specifically aim for low development effort by considering C-based high-level hardware synthesis rather than complex manual hardware design. SpExSim not only exploits the spatially distributed model of computation commonly used on FPGAs, but can also model the effect of two different microarchitectures commonly used in C-to-hardware compilers, including pipelined architectures with modulo scheduling. The estimates have been validated against actual hardware generated by two current HLS tools.

Keywords

Reconfigurable computing, FPGA, Hardware acceleration, High-level synthesis, Estimation, Legacy code

Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  1. Embedded Systems and Applications Group, Technische Universität Darmstadt, Darmstadt, Germany
