Journal of Real-Time Image Processing, Volume 13, Issue 1, pp 205–225

A matrix-free approach to efficient affine-linear image registration on CPU and GPU

  • Jan Rühaak
  • Lars König
  • Florian Tramnitzke
  • Harald Köstler
  • Jan Modersitzki
Special Issue Paper


This paper presents a generic approach to highly efficient image registration in two and three dimensions. Both monomodal and multimodal registration problems are considered. We focus on the important class of affine-linear transformations in a derivative-based optimization framework. Our main contribution is an explicit formulation of the objective function gradient and Hessian approximation that allows for very efficient, parallel derivative calculation with virtually no memory requirements. The flexible parallelism of our concept allows for direct implementation on various hardware platforms. Derivative calculations are fully matrix free and operate directly on the input data, thereby reducing the auxiliary space requirements from \({\mathcal {O}}(n)\) to \({\mathcal {O}}(1)\). The proposed approach is implemented on multicore CPU and GPU. Our GPU code outperforms a conventional matrix-based CPU implementation by more than two orders of magnitude, thus enabling usage in real-time scenarios. The computational properties of our approach are extensively evaluated, thereby demonstrating the performance gain for a variety of real-life medical applications.
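To illustrate what "matrix free" can mean in this setting, the following is a minimal sketch, not the authors' implementation. It assumes a 2D sum-of-squared-differences (SSD) distance, bilinear interpolation, and a six-parameter affine map y = (w0*i + w1*j + w2, w3*i + w4*j + w5); the function names `bilinear` and `ssd_matrix_free` are hypothetical. Each pixel builds its own Jacobian row on the fly and adds it to a fixed set of accumulators, so the auxiliary storage is independent of the number of pixels n.

```python
def bilinear(img, y1, y2):
    """Bilinear interpolant of img at (y1, y2): returns (value, d/dy1, d/dy2).

    Assumption for this sketch: points outside the grid yield zero value
    and zero gradient.
    """
    n1, n2 = len(img), len(img[0])
    if not (0.0 <= y1 < n1 - 1 and 0.0 <= y2 < n2 - 1):
        return 0.0, 0.0, 0.0
    i, j = int(y1), int(y2)          # lower-left grid node (y1, y2 >= 0 here)
    a, b = y1 - i, y2 - j            # local cell coordinates in [0, 1)
    t00, t01 = img[i][j], img[i][j + 1]
    t10, t11 = img[i + 1][j], img[i + 1][j + 1]
    val = (1 - a) * (1 - b) * t00 + (1 - a) * b * t01 + a * (1 - b) * t10 + a * b * t11
    d1 = (1 - b) * (t10 - t00) + b * (t11 - t01)
    d2 = (1 - a) * (t01 - t00) + a * (t11 - t10)
    return val, d1, d2


def ssd_matrix_free(ref, tmpl, w):
    """SSD value, gradient and Gauss-Newton Hessian approximation for affine w.

    No n-by-6 Jacobian is stored: only 1 + 6 + 36 accumulators are kept.
    """
    value = 0.0
    grad = [0.0] * 6
    hess = [[0.0] * 6 for _ in range(6)]
    for i in range(len(ref)):
        for j in range(len(ref[0])):
            # Affine transformation of the grid point (i, j).
            y1 = w[0] * i + w[1] * j + w[2]
            y2 = w[3] * i + w[4] * j + w[5]
            t, d1, d2 = bilinear(tmpl, y1, y2)
            r = t - ref[i][j]        # residual at this pixel
            # One Jacobian row (chain rule), used immediately and discarded.
            row = (d1 * i, d1 * j, d1, d2 * i, d2 * j, d2)
            value += 0.5 * r * r
            for p in range(6):
                grad[p] += r * row[p]
                for q in range(6):
                    hess[p][q] += row[p] * row[q]
    return value, grad, hess
```

Because every pixel contributes independently to the accumulators, the double loop could be split across threads and combined with a reduction; the sequential loop here is only meant to show the data flow.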


Keywords: Image registration; Computational efficiency; Parallel algorithms; GPU programming; Real-time processing



J. Rühaak, L. König, F. Tramnitzke and J. Modersitzki received funding from the European Union, European Regional Development Fund, Grant No. 122-10-002. All authors declare that they have no conflicts of interest.



Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  • Jan Rühaak (1)
  • Lars König (1)
  • Florian Tramnitzke (1)
  • Harald Köstler (2)
  • Jan Modersitzki (3)
  1. Fraunhofer MEVIS, Lübeck, Germany
  2. Universität Erlangen-Nürnberg, Lehrstuhl für Systemsimulation, Erlangen, Germany
  3. Universität zu Lübeck, Institute of Mathematics and Image Computing, Lübeck, Germany
