Parallel Medical Image Reconstruction: From Graphics Processors to Grids

  • Maraike Schellmann
  • Sergei Gorlatch
  • Dominik Meiländer
  • Thomas Kösters
  • Klaus Schäfers
  • Frank Wübbeling
  • Martin Burger
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5698)


We present a variety of parallelization approaches for a real-world case study on several modern parallel and distributed computer architectures. Our case study is a production-quality, time-intensive algorithm for medical image reconstruction used in computer tomography. We describe how this algorithm can be parallelized for the main kinds of contemporary parallel architectures: shared-memory multiprocessors, distributed-memory clusters, graphics processors, and the Cell processor. Finally, we show how these architectures can be accessed in a distributed Grid environment. The main contribution of the paper, besides the parallelization approaches themselves, is their systematic comparison with respect to four criteria: performance, programming comfort, accessibility, and cost-effectiveness. We report results of experiments on particular parallel machines of different architectures that confirm the findings of our systematic comparison.
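As a rough illustration of what parallelizing an iterative reconstruction can look like on a shared-memory machine, the sketch below splits the measurements across workers and sums their partial results. It is not the paper's implementation: the abstract does not name the algorithm, so the code assumes a generic expectation-maximization (EM) update on a tiny dense system matrix, and all names (`partial_correction`, `em_iteration`, `system`, `counts`) are hypothetical. Python threads are used only to show the decomposition, not to demonstrate a speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_correction(rows, counts, image):
    """Back-project measured/expected ratios for one chunk of measurements."""
    corr = [0.0] * len(image)
    for row, y in zip(rows, counts):
        # Forward projection: expected value of this measurement.
        expected = sum(a * x for a, x in zip(row, image))
        if expected > 0.0:
            ratio = y / expected
            # Back projection: distribute the ratio over the image elements.
            for j, a in enumerate(row):
                corr[j] += a * ratio
    return corr


def em_iteration(system, counts, image, workers=2):
    """One EM update with the measurements split into `workers` chunks."""
    n = len(image)
    # Sensitivity of each image element (column sums of the system matrix).
    sens = [sum(row[j] for row in system) for j in range(n)]
    # Assumed decomposition: each worker gets every `workers`-th measurement.
    chunks = [(system[w::workers], counts[w::workers]) for w in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(
            pool.map(lambda c: partial_correction(c[0], c[1], image), chunks)
        )
    # Reduction step: sum the partial correction images of all workers.
    corr = [sum(p[j] for p in partials) for j in range(n)]
    return [image[j] * corr[j] / sens[j] if sens[j] > 0 else 0.0 for j in range(n)]


# Toy data (made up for illustration): 3 measurements, 2 image elements.
system = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
counts = [4.0, 3.0, 2.0]
image = [1.0, 1.0]

sequential = em_iteration(system, counts, image, workers=1)
parallel = em_iteration(system, counts, image, workers=3)
```

In this sketch the same reduction pattern would carry over to the other architecture classes the abstract lists: on a distributed-memory cluster the summation of partial images would become a collective communication step, and on a graphics processor each chunk would map to a group of threads. These mappings are stated here as assumptions about the general pattern, not as a description of the paper's design.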


Positron Emission Tomography; Multicore Processor; Graphic Processor; Hybrid Cluster; Cell Processor
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Maraike Schellmann (1)
  • Sergei Gorlatch (1)
  • Dominik Meiländer (1)
  • Thomas Kösters (1)
  • Klaus Schäfers (1)
  • Frank Wübbeling (1)
  • Martin Burger (1)

  1. University of Münster, Germany
