Heterogeneous Systems for Energy Efficient Scientific Computing

  • Qiang Liu
  • Wayne Luk
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7199)


This paper introduces a novel approach for exploring heterogeneous computing engines that include GPUs and FPGAs as accelerators. Our goal is to automate the systematic search for solutions on such engines that maximize energy efficiency while meeting throughput requirements and resource constraints. The proposed approach, based on a linear programming model, enables optimization of system throughput and energy efficiency, as well as analysis of energy efficiency sensitivity and power consumption issues. It can be used to evaluate current and future computing hardware and interfaces in order to identify appropriate combinations. A heterogeneous system containing a CPU, a GPU and an FPGA with a PCI Express interface is studied based on the High Performance Linpack application. Results indicate that such a heterogeneous computing system can provide energy-efficient solutions for scientific computing across various performance demands. The improvement in system energy efficiency is more sensitive to some system components than to others. For example, in the studied system, concurrently improving the energy efficiency of the interface and the GPU by 10 times could lead to an over 10 times improvement in system energy efficiency.
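To make the idea of energy-aware workload allocation concrete, here is a minimal sketch of a much simplified linear program. It is not the model from the paper. All names (`Device`, `allocate`) and all device figures are hypothetical, devices are assumed to run in parallel, and interface (PCI Express) transfer costs are ignored. The program minimizes total energy, sum(e_i * x_i), subject to sum(x_i) = W and 0 <= x_i <= r_i * T, where x_i is the work given to device i, e_i its energy per unit of work, r_i its throughput, W the total work and T the deadline. For this particular structure (one equality constraint plus per-device upper bounds), filling the cheapest device first yields an optimal solution, so no LP solver is needed.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    gflops: float            # sustained throughput r_i (hypothetical figure)
    joules_per_gflop: float  # energy per unit of work e_i (hypothetical figure)


def allocate(devices, work_gflop, deadline_s):
    """Minimise sum(e_i * x_i) s.t. sum(x_i) = work and 0 <= x_i <= r_i * deadline.

    Returns (plan, energy_joules), where plan maps device name to assigned work.
    Raises ValueError if the devices together cannot finish by the deadline.
    """
    capacity = sum(d.gflops * deadline_s for d in devices)
    if capacity < work_gflop:
        raise ValueError("throughput requirement cannot be met")

    remaining = work_gflop
    plan = {}
    # Greedy fill: the cheapest energy per unit of work is used first.
    for d in sorted(devices, key=lambda d: d.joules_per_gflop):
        share = min(remaining, d.gflops * deadline_s)
        plan[d.name] = share
        remaining -= share

    energy = sum(plan[d.name] * d.joules_per_gflop for d in devices)
    return plan, energy


# Illustrative, made-up device characteristics.
devices = [
    Device("CPU", gflops=50.0, joules_per_gflop=2.0),
    Device("GPU", gflops=300.0, joules_per_gflop=0.5),
    Device("FPGA", gflops=100.0, joules_per_gflop=0.25),
]

plan, energy = allocate(devices, work_gflop=1000.0, deadline_s=4.0)
efficiency = 1000.0 / energy  # GFLOP per joule, i.e. GFLOPS per watt
```

In this sketch, tightening the deadline forces more work onto less efficient devices, which is the throughput versus energy efficiency trade-off the abstract refers to. A fuller model with communication costs and resource constraints would generally need a real LP solver.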





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Qiang Liu (1)
  • Wayne Luk (2)
  1. School of Electronic Information Engineering, Tianjin University, Tianjin, China
  2. Department of Computing, Imperial College London, London, UK
