Frameworks for Multi-core Architectures: A Comprehensive Evaluation Using 2D/3D Image Registration
Over the last few years, the development of standard processors has shifted from bigger, more complex, and faster cores to placing several simpler cores on one chip. This shift has also changed the way programs are written, since they must now leverage the processing power of multiple cores of the same processor. Initially, programmers had to divide the work by hand, distribute it to the available cores, and manage threads themselves in order to use more than one core. Today, several frameworks exist to relieve the programmer of such tasks. In this paper, we present five such frameworks for parallelization on shared memory multi-core architectures, namely OpenMP, Cilk++, Threading Building Blocks, RapidMind, and OpenCL. To evaluate these frameworks, we investigate a real-world application from medical imaging, 2D/3D image registration. In an empirical study, a fine-grained data-parallel approach and a coarse-grained task-parallel approach are used to evaluate and estimate different aspects of each framework, such as usability, performance, and overhead.
Keywords: Parallelization, Frameworks, Evaluation, Medical Imaging, 2D/3D Image Registration, OpenMP, Cilk++, Threading Building Blocks, RapidMind, OpenCL