Abstract
The Cell processor is a heterogeneous multi-core processor with one Power Processing Engine (PPE) core and eight Synergistic Processing Engine (SPE) cores. Each SPE has a small, directly accessible local memory (256 KB) and accesses system memory through DMA operations. Cell programming is complicated both by the need to explicitly manage DMA data transfers for SPE computation and by the multiple layers of parallelism in the architecture: heterogeneous cores, multiple SPE cores, multithreading, SIMD units, and multiple instruction issue. A significant amount of ongoing research in programming models and tools attempts to make it easier to exploit the computational power of the Cell architecture. In our work, we explore supporting OpenMP on the Cell processor. OpenMP is a widely used API for parallel programming, and supporting it is attractive because programmers can continue to use a familiar programming model and existing code can be reused. We base our work on IBM’s XL compiler, which already has OpenMP support for AIX multi-processor systems built with Power processors. We developed new components in the XL compiler and a new runtime library for Cell OpenMP that uses the Cell SDK libraries to target specific features of the new hardware platform. To describe the design of our Cell OpenMP implementation, we focus on three major issues: 1) how to use the heterogeneous cores and synchronization support in the Cell to optimize OpenMP threads; 2) how to generate thread code targeting the different instruction sets of the PPE and SPE from within a compiler that takes single-source input; 3) how to implement the OpenMP memory model on the Cell memory system. We present experimental results for some SPEC OMP 2001 and NAS benchmarks to demonstrate the effectiveness of this approach. We also observe detailed runtime event sequences using the visualization tool Paraver, and we use the resulting insight into actual thread and synchronization behavior to direct further optimizations.
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
O’Brien, K., O’Brien, K., Sura, Z., Chen, T., Zhang, T. (2008). Supporting OpenMP on Cell. In: Chapman, B., Zheng, W., Gao, G.R., Sato, M., Ayguadé, E., Wang, D. (eds) A Practical Programming Model for the Multi-Core Era. IWOMP 2007. Lecture Notes in Computer Science, vol 4935. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69303-1_6
DOI: https://doi.org/10.1007/978-3-540-69303-1_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69302-4
Online ISBN: 978-3-540-69303-1
eBook Packages: Computer Science (R0)