An Approach for Performance Estimation of Hybrid Systems with FPGAs and GPUs as Coprocessors

  • Volker Hampel
  • Thilo Pionteck
  • Erik Maehle
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7179)


This paper presents an approach for modeling the achievable speed-ups of FPGAs (Field Programmable Gate Arrays) or GPUs (Graphics Processing Units) as coprocessors in hybrid computing systems. The underlying computation model assumes that the coprocessors are separate devices and that their input and output data are transferred from and into the system’s memory. The model considers all overheads involved when (sub-)tasks are performed on a coprocessor instead of the CPU. The validity of the model is checked against measured values by means of a sample application. In addition, the theoretical maximum speed-ups of two hybrid systems compared to an optimal single-core CPU implementation are approximated. Using the penalty factor P_SEQ as a measure of the degree to which a program cannot be fully parallelized due to data dependencies, a system with an Nvidia GTX 285 GPU achieves a speed-up of 2.7 times P_SEQ, while for a single node of a Cray XD1 with a Xilinx Virtex4 LX160 the speed-up is about 1 times P_SEQ.
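
The abstract does not give the model's equations, so the following is only a minimal sketch of the general idea it describes: offloading a task to a separate-device coprocessor pays for moving input and output data through system memory in addition to the computation itself. The function name, its parameters, the single fixed-bandwidth transfer term, and all numbers are assumptions made for illustration and are not taken from the paper.

```python
def offload_speedup(t_cpu, t_coproc, bytes_in, bytes_out, bandwidth, t_setup=0.0):
    """Estimate the speed-up of running a task on a coprocessor instead of the CPU.

    Hypothetical simplification (not the paper's model):
      t_cpu      time of the CPU-only implementation, in seconds
      t_coproc   pure compute time on the coprocessor, in seconds
      bytes_in   input data copied from system memory to the device
      bytes_out  output data copied from the device back to system memory
      bandwidth  sustained transfer rate in bytes per second (assumed constant)
      t_setup    any fixed per-call overhead, in seconds
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    # Transfers in both directions are assumed not to overlap with computation.
    t_transfer = (bytes_in + bytes_out) / bandwidth
    t_offload = t_setup + t_transfer + t_coproc
    return t_cpu / t_offload


# Illustrative numbers only: a 10x faster kernel loses much of its
# advantage once 0.2 s of data transfer is added to 0.1 s of compute.
speedup = offload_speedup(
    t_cpu=1.0, t_coproc=0.1, bytes_in=4e8, bytes_out=4e8, bandwidth=4e9
)
print(f"estimated speed-up: {speedup:.2f}")
```

Under these assumed values the transfer time is twice the compute time on the device, which shows why a model of this kind has to account for every overhead on the offload path rather than for kernel time alone.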


Hybrid System; Graphic Processing Unit; Clock Cycle; Field Programmable Gate Array; Clock Frequency
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Volker Hampel (1)
  • Thilo Pionteck (1)
  • Erik Maehle (1)
  1. Institute of Computer Engineering, University of Lübeck, Lübeck, Germany
