International Journal of Parallel Programming, Volume 37, Issue 5, pp 488–507

The Bottom-Up Implementation of One MILC Lattice QCD Application on the Cell Blade

  • Guochun Shi
  • Volodymyr Kindratenko
  • Steven Gottlieb

DOI: 10.1007/s10766-009-0102-0

Cite this article as:
Shi, G., Kindratenko, V. & Gottlieb, S. Int J Parallel Prog (2009) 37: 488. doi:10.1007/s10766-009-0102-0


We report the results of the bottom-up implementation of one MILC lattice quantum chromodynamics (QCD) application on the Cell Broadband Engine™ processor. In our implementation, we preserve MILC’s framework for scaling the application to run on a large number of compute nodes and accelerate computationally intensive kernels on the Cell’s synergistic processor elements. Speedups of 3.4× for the 8 × 8 × 16 × 16 lattice and 5.7× for the 16 × 16 × 16 × 16 lattice are obtained when comparing our implementation of the MILC application executed on a 3.2 GHz Cell processor to the standard MILC code executed on a quad-core 2.33 GHz Intel Xeon processor. We provide an empirical model to predict application performance for a given lattice size. We also show that the performance of the compute-intensive part of the application on the Cell processor is limited by the bandwidth between main memory and the Cell’s synergistic processor elements, whereas the performance of the application’s parallel execution framework is limited by the bandwidth between main memory and the Cell’s power processor element.


Keywords: Cell Broadband Engine, Quantum chromodynamics, MILC

Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  • Guochun Shi (1)
  • Volodymyr Kindratenko (1)
  • Steven Gottlieb (2)

  1. National Center for Supercomputing Applications, University of Illinois, Urbana, USA
  2. Department of Physics, Indiana University, Bloomington, USA
