LALP: A Language to Program Custom FPGA-Based Acceleration Engines

  • Ricardo Menotti
  • João M. P. Cardoso
  • Marcio M. Fernandes
  • Eduardo Marques


Field-Programmable Gate Arrays (FPGAs) are becoming increasingly important in embedded and high-performance computing systems. They deliver performance close to that of Application-Specific Integrated Circuits (ASICs) while retaining design and implementation flexibility. However, programming FPGAs efficiently requires the expertise of hardware developers who have mastered hardware description languages (HDLs) such as VHDL or Verilog. Attempts to provide a high-level compilation flow (e.g., from C programs) still have open issues to address before they can deliver efficient results more broadly. With the resources available on an FPGA in mind, LALP (Language for Aggressive Loop Pipelining), a novel language for programming FPGA-based accelerators, has been developed together with its compilation framework, which includes mapping capabilities. The main ideas behind LALP are to provide a higher abstraction level than HDLs, to exploit the intrinsic parallelism of hardware resources, and to let the programmer control execution stages whenever compiler techniques are unable to generate efficient implementations. These features are particularly useful for implementing loop pipelining, a well-regarded technique for accelerating computations in several application domains. This paper describes LALP and shows how it can be used to achieve high-performance computing solutions.


Loop pipelining, Compilers, Reconfigurable computing, FPGA





Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • Ricardo Menotti (1, corresponding author)
  • João M. P. Cardoso (2)
  • Marcio M. Fernandes (1)
  • Eduardo Marques (3)

  1. Departamento de Computação, Universidade Federal de São Carlos, São Carlos, Brazil
  2. Departamento de Engenharia Informática, Faculdade de Engenharia, Universidade do Porto, Porto, Portugal
  3. Inst. Ciências Matemáticas e Computação, Universidade de São Paulo, São Carlos, Brazil
