Exploiting Outer Loops Vectorization in High Level Synthesis

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9017)

Abstract

Synthesis of DoAll loops is a key aspect of High Level Synthesis since they allow to easily exploit the potential parallelism provided by programmable devices. This type of parallelism can be implemented in several ways: by duplicating the implementation of body loop, by exploiting loop pipelining or by applying vectorization.

In this paper a methodology for the synthesis of complex DoAll loops based on outer vectorization is proposed. Vectorization is not limited to the innermost loops: complex constructs such as nested loops, conditional constructs and function calls are supported. Experimental results on parallel benchmarks show up to 7.35x speed-up and up to 40 % reduction of area-delay product.

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. 1.
    Altera: Quartus II (2013). http://www.altera.com
  2. 2.
    Choi, J., Brown, S., Anderson, J.: From software threads to parallel hardware in high-level synthesis for fpgas. In: FPT ’13, pp. 270–277, December 2013Google Scholar
  3. 3.
    Cilardo, A., Gallo, L., Mazzocca, N.: Design space exploration for high-level synthesis of multi-threaded applications. Journal of Systems Architecture 59(10, Part D), 1171–1183 (2013)CrossRefGoogle Scholar
  4. 4.
    Cong, J., Liu, B., Neuendorffer, S., Noguera, J., Vissers, K., Zhang, Z.: High-level synthesis for fpgas: From prototyping to deployment. IEEE TCAD 30(4), 473–491 (2011)Google Scholar
  5. 5.
    Cong, J., Jiang, W.: Pattern-based behavior synthesis for fpga resource reduction. In: FPGA 2008, pp. 107–116. ACM, New York (2008)Google Scholar
  6. 6.
    Fingeroff, M.: High-Level Synthesis Blue Book. Xlibris Corporation (2010)Google Scholar
  7. 7.
    Gupta, S., Savoiu, N., Kim, S., Dutt, N., Gupta, R., Nicolau, A.: Speculation techniques for high level synthesis of control intensive designs. In: DAC 2001, pp. 269–272. ACM, New York (2001)Google Scholar
  8. 8.
    Hadjis, S., Canis, A., Anderson, J.H., Choi, J., Nam, K., Brown, S., Czajkowski, T.: Impact of fpga architecture on resource sharing in high-level synthesis. In: FPGA 2012, pp. 111–114. ACM, New York (2012)Google Scholar
  9. 9.
    Kurra, S., Singh, N.K., Panda, P.R.: The impact of loop unrolling on controller delay in high level synthesis. In: DATE ’07, pp. 391–396 (2007)Google Scholar
  10. 10.
    Mahlke, S.A., Lin, D.C., Chen, W.Y., Hank, R.E., Bringmann, R.A.: Effective compiler support for predicated execution using the hyperblock. SIGMICRO Newsl. 23(1–2), 45–54 (1992)CrossRefGoogle Scholar
  11. 11.
    Morvan, A., Derrien, S., Quinton, P.: Polyhedral bubble insertion: A method to improve nested loop pipelining for high-level synthesis. IEEE TCAD 32(3), 339–352 (2013)Google Scholar
  12. 12.
    Naishlos, D.: Autovectorization in GCC. In: GCC Developers Summit, pp. 105–118 (2004)Google Scholar
  13. 13.
    Nuzman, D., Zaks, A.: Outer-loop vectorization: revisited for short simd architectures. In: PACT 2008, pp. 2–11. ACM, New York (2008). http://doi.acm.org/10.1145/1454115.1454119
  14. 14.
    OpenMP: Application Program Interface, version 4.0, July 2013Google Scholar
  15. 15.
    Papakonstantinou, A., Gururaj, K., Stratton, J.A., Chen, D., Cong, J., Hwu, W.M.W.: Efficient compilation of cuda kernels for high-performance computing on fpgas. ACM TECS 13(2), 1–26 (2013)CrossRefGoogle Scholar
  16. 16.
    Pilato, C., Ferrandi, F.: Bambu: A modular framework for the high level synthesis of memory-intensive applications. In: FPL 2013, pp. 1–4, September 2013Google Scholar
  17. 17.
    Raghunathan, V., Raghunathan, A., Srivastava, M., Ercegovac, M.: High-level synthesis with simd units. In: ASP-DAC 2002, pp. 407–413 (2002)Google Scholar
  18. 18.
    Xilinx: Vivado Design Suite (2013). http://www.xilinx.com
  19. 19.
    Zuo, W., Liang, Y., Li, P., Rupnow, K., Chen, D., Cong, J.: Improving high level synthesis optimization opportunity through polyhedral transformations. In: FPGA 2013, pp. 9–18. ACM, New York (2013)Google Scholar

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. 1.Dipartimento di Elettronica, Informazione e BioingegneriaPolitecnico di MilanoMilanItaly

Personalised recommendations