Programming Heterogeneous Multicore Systems Using Threading Building Blocks

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6586)


Intel’s Threading Building Blocks (TBB) provide a high-level abstraction for expressing parallelism in applications without writing explicitly multi-threaded code. However, TBB is only available for shared-memory, homogeneous multicore processors. Codeplay’s Offload C++ provides a single-source, POSIX threads-like approach to programming heterogeneous multicore devices where cores are equipped with private, local memories—code to move data between memory spaces is generated automatically. In this paper, we show that the strengths of TBB and Offload C++ can be combined, by implementing part of the TBB headers in Offload C++. This allows applications parallelised using TBB to run, without source-level modifications, across all the cores of the Cell BE processor. We present experimental results applying our method to a set of TBB programs. To our knowledge, this work marks the first demonstration of programs parallelised using TBB executing on a heterogeneous multicore architecture.
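To illustrate what implementing part of a TBB-style header on a different threading substrate might look like, here is a minimal sketch. It is not the paper's implementation and does not use Offload C++: `std::thread` stands in for the underlying threads, and the `sketch` namespace, the `workers` parameter, and the `scaled` helper are hypothetical names introduced only for this example. The shape of `blocked_range` and `parallel_for` loosely mirrors the TBB interface, so calling code reads much as it would against TBB.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

namespace sketch {  // hypothetical namespace, not the real tbb:: one

// Half-open index range [begin, end), loosely mirroring tbb::blocked_range.
template <typename T>
class blocked_range {
public:
    blocked_range(T b, T e) : begin_(b), end_(e) {}
    T begin() const { return begin_; }
    T end() const { return end_; }
private:
    T begin_, end_;
};

// Split the range into contiguous chunks (at most one per worker) and run
// the body on each chunk in its own thread. In the setting the abstract
// describes, this is where threads would be started on the accelerator
// cores instead; std::thread is an assumption made for illustration.
template <typename T, typename Body>
void parallel_for(const blocked_range<T>& range, const Body& body,
                  unsigned workers = 4) {
    if (!(range.begin() < range.end())) return;  // empty range: nothing to do
    if (workers == 0) workers = 1;
    const T total = range.end() - range.begin();
    const T n = static_cast<T>(workers);
    const T chunk = (total + n - 1) / n;  // ceiling division
    std::vector<std::thread> pool;
    for (T lo = range.begin(); lo < range.end(); lo += chunk) {
        const T hi = std::min<T>(lo + chunk, range.end());
        pool.emplace_back([&body, lo, hi] { body(blocked_range<T>(lo, hi)); });
    }
    for (auto& t : pool) t.join();  // wait for every chunk to finish
}

}  // namespace sketch

// Application code in the TBB style: no explicit thread management.
// Each chunk writes to a disjoint slice of `data`, so no locking is needed.
std::vector<int> scaled(std::size_t n) {
    std::vector<int> data(n);
    sketch::parallel_for(
        sketch::blocked_range<std::size_t>(0, n),
        [&data](const sketch::blocked_range<std::size_t>& r) {
            for (std::size_t i = r.begin(); i != r.end(); ++i)
                data[i] = static_cast<int>(i) * 2;
        });
    return data;
}
```

The point of the sketch is that the application-level call site depends only on the header's interface, so swapping the implementation behind `parallel_for` leaves the calling code unchanged.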


Keywords: Local Memory; Memory Space; Multicore Processor; Bulk Transfer; Host Memory
These keywords were added by machine and not by the authors.



Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  1. Codeplay Software Ltd., Edinburgh, UK
  2. School of Computing Science, University of Glasgow, UK
  3. Computing Laboratory, University of Oxford, UK
