Design Methodology for Offloading Software Executions to FPGA
A field-programmable gate array (FPGA) is a flexible platform for offloading part of the computation from a processor. In particular, it can accelerate the computationally heavy parts of a software application, e.g., in digital signal processing (DSP), where small kernels are executed repeatedly. Because the application code for a processor is software, a design methodology is needed to convert that code into a hardware implementation suitable for the FPGA. In this paper, we propose a design method that uses the Transport Triggered Architecture (TTA) processor template and the TTA-based Co-design Environment (TCE) toolset to automate the design process. Starting from software, we generate an RTL implementation of an application-specific TTA processor together with the hardware/software interfaces required to offload computations from the system's main processor. To illustrate how a customized TTA can be integrated with a new platform, we describe the process of developing the required interfaces from scratch. Finally, we show how to exploit the scalability of the TTA processor to meet platform- and application-specific requirements.
Keywords: Application-specific integrated circuits; Hardware accelerator; Computer aided engineering; System-on-a-chip; Coprocessors; Field programmable gate arrays