Abstract
Heterogeneous computing plays an ever-increasing role in power-efficient, high-performance embedded systems for data processing tasks such as computer vision. One way to accelerate such applications is to use FPGAs as co-processors for standard CPUs. Although High-Level Synthesis tools are making hardware design easier, the question of how to interface FPGAs and CPUs has yet to be completely solved. The Heterogeneous System Architecture (HSA) Foundation defines and publishes architecture-neutral standards for heterogeneous systems and programming models. While compatible CPU, GPU, and DSP designs exist, FPGA models have not yet been defined. This paper describes the IP library LibHSA, which greatly simplifies the integration of domain-specific FPGA acceleration into existing HSA-compliant systems. It allows FPGA-based accelerators to take immediate advantage of high-level language tool chains as well as user-space memory access, low-latency task dispatch, and other benefits of the HSA programming model. We demonstrate LibHSA with a programmable image processor implemented on a Xilinx FPGA. The image processor supports low-level algorithms such as Sobel, median, Laplace, and Gaussian filtering. Our results show that the LibHSA infrastructure greatly reduces the effort of integrating FPGAs and customized hardware into existing accelerator systems, runtimes, and application software.
Cite this article
Reichenbach, M., Holzinger, P., Häublein, K. et al. Heterogeneous Computing Utilizing FPGAs. J Sign Process Syst 91, 745–757 (2019). https://doi.org/10.1007/s11265-018-1382-7
DOI: https://doi.org/10.1007/s11265-018-1382-7