An Approach for Performance Estimation of Hybrid Systems with FPGAs and GPUs as Coprocessors
This paper presents an approach for modeling the achievable speed-ups of FPGAs (Field Programmable Gate Arrays) or GPUs (Graphics Processing Units) used as coprocessors in hybrid computing systems. The underlying computation model assumes that the coprocessors are separate devices and that their input and output data are transferred from and to the system's memory. The model considers all overheads incurred when (sub-)tasks are performed on a coprocessor instead of the CPU. The validity of the model is checked against measured values by means of a sample application. In addition, the theoretical maximum speed-ups of two hybrid systems relative to an optimal single-core CPU implementation are approximated. With the penalty factor P_SEQ as a measure of the degree to which a program cannot be fully parallelized due to data dependencies, a system with an NVIDIA GTX 285 GPU achieves a speed-up of 2.7 times P_SEQ, while a single node of a Cray XD1 with a Xilinx Virtex4 LX160 achieves a speed-up of about 1 times P_SEQ.
Keywords: Hybrid System; Graphic Processing Unit; Clock Cycle; Field Programmable Gate Array; Clock Frequency
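The abstract describes a computation model in which a coprocessor only pays off once the cost of moving data between system memory and the device is accounted for. The following is a minimal sketch of that general idea, not the model from the paper: the function `offload_speedup`, its parameters, and all numbers are hypothetical and chosen purely for illustration, and the penalty factor P_SEQ is not modeled here.

```python
def offload_speedup(t_cpu, t_kernel, bytes_in, bytes_out, bandwidth, t_setup=0.0):
    """Estimate the speed-up of offloading one task to a coprocessor.

    Hypothetical, simplified model (an assumption, not the paper's formula):
      t_cpu     -- run time of the task on the CPU, in seconds
      t_kernel  -- pure compute time on the coprocessor, in seconds
      bytes_in  -- input data copied from system memory to the device
      bytes_out -- result data copied back to system memory
      bandwidth -- sustained transfer rate of the link, in bytes per second
      t_setup   -- fixed per-call overhead (e.g. launch or configuration), in seconds
    """
    # Transfers in both directions share the same link in this sketch.
    t_transfer = (bytes_in + bytes_out) / bandwidth
    # Total time on the offloaded path: setup, data movement, then compute.
    t_offload = t_setup + t_transfer + t_kernel
    return t_cpu / t_offload


if __name__ == "__main__":
    # Illustrative values only; they are not measurements from the paper.
    kernel_only = 1.0 / 0.05
    with_transfers = offload_speedup(
        t_cpu=1.0, t_kernel=0.05, bytes_in=4e8, bytes_out=1e8, bandwidth=5e9
    )
    print(f"kernel-only speed-up:     {kernel_only:.2f}")
    print(f"speed-up incl. transfers: {with_transfers:.2f}")
```

In this sketch the data transfers take twice as long as the computation itself, so the end-to-end speed-up is a third of what the kernel alone would suggest. That gap is the kind of overhead the abstract says the model has to capture.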