Special Issue Paper

Computer Science - Research and Development, Volume 28, Issue 2, pp 175-184

Performance characterization of data-intensive kernels on AMD Fusion architectures

  • Kenneth Lee, Department of Computer Science, Virginia Tech
  • Heshan Lin, Department of Computer Science, Virginia Tech
  • Wu-chun Feng, Department of Computer Science, Virginia Tech (corresponding author)


The cost of data movement over the PCI Express bus is one of the biggest performance bottlenecks for accelerating data-intensive applications on traditional discrete GPU architectures. To address this bottleneck, AMD Fusion introduces a fused architecture that tightly integrates the CPU and GPU onto the same die and connects them with a high-speed, on-chip memory controller. This novel architecture incorporates shared memory between the CPU and GPU, thus enabling several techniques for inter-device data transfer that are not available on discrete architectures. For instance, a kernel running on the GPU can now directly access a CPU-resident memory buffer and vice versa.

In this paper, we seek to understand the implications of the fused architecture on CPU-GPU heterogeneous computing by systematically characterizing various memory-access techniques instantiated with diverse memory-bound kernels on the latest AMD Fusion system (i.e., Llano A8-3850). Our study reveals that the fused architecture is very promising for accelerating data-intensive applications on heterogeneous platforms in support of supercomputing.
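
The trade-off described in the abstract, explicit copies versus a kernel that reads a CPU-resident buffer in place, can be illustrated with a back-of-the-envelope cost model. The sketch below is not the authors' methodology and contains no measurements from the paper: the `Workload` type, the function names, and the `slowdown` factor are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of one memory-bound kernel launch."""
    bytes_in: float        # bytes sent to the GPU before the kernel runs
    bytes_out: float       # bytes returned to the CPU afterwards
    kernel_seconds: float  # kernel time when its data is already GPU-local


def time_with_explicit_copy(w: Workload, link_bytes_per_s: float) -> float:
    """Copy in, run, copy out: the usual pattern on a discrete GPU."""
    transfer = (w.bytes_in + w.bytes_out) / link_bytes_per_s
    return transfer + w.kernel_seconds


def time_with_direct_access(w: Workload, slowdown: float) -> float:
    """Kernel accesses the CPU-resident buffer in place, so no copies are issued.

    `slowdown` (>= 1.0) is an assumed factor for the kernel running slower
    when its loads and stores target host-resident memory.
    """
    return w.kernel_seconds * slowdown


def break_even_slowdown(w: Workload, link_bytes_per_s: float) -> float:
    """Largest slowdown at which direct access still matches explicit copies."""
    return time_with_explicit_copy(w, link_bytes_per_s) / w.kernel_seconds


# Illustrative numbers only, chosen for easy arithmetic (not from the paper).
example = Workload(bytes_in=1e9, bytes_out=1e9, kernel_seconds=0.5)
copy_time = time_with_explicit_copy(example, link_bytes_per_s=4e9)
direct_time = time_with_direct_access(example, slowdown=1.5)
limit = break_even_slowdown(example, link_bytes_per_s=4e9)
```

Under this simplified model, direct access wins whenever the assumed slowdown stays below the break-even value. The more memory-bound the kernel (large transfers relative to kernel time), the higher that break-even value becomes.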


Keywords: GPU, AMD Fusion, Memory transfer