Journal of Signal Processing Systems, Volume 55, Issue 1–3, pp 229–250

Non-rigid Registration for Large Sets of Microscopic Images on Graphics Processors

  • Antonio Ruiz
  • Manuel Ujaldon
  • Lee Cooper
  • Kun Huang


Microscopic imaging is an important tool for characterizing tissue morphology and pathology. 3D reconstruction and visualization of large tissue samples require registration of large sets of high-resolution images, and the scale of this problem presents a challenge for automatic registration methods. In this paper we present a novel method for efficient automatic registration using graphics processing units (GPUs) and parallel programming. Comparing a C++ implementation running on the CPU against an implementation that uses Compute Unified Device Architecture (CUDA) libraries and pthreads to run on the GPU, we achieve a speed-up factor of up to 4.11× with a single GPU and 6.68× with a GPU pair. We present execution times for a benchmark composed of two sets of large-scale images: mouse placenta (16K × 16K pixels) and breast cancer tumors (23K × 62K pixels). In the genetic case, the C++ implementation takes more than 12 hours to register a typical sample of 500 consecutive slides; two GPUs reduce this to less than 2 hours, and the approach shows promising scalability for extending those gains to a larger number of GPUs in a distributed system.
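As a rough consistency check on the figures quoted in the abstract, the short Python sketch below relates the single-GPU and dual-GPU speed-up factors and projects a running time from the 12-hour CPU baseline. The function names are illustrative and do not come from the paper. The calculation also assumes that the peak ("up to") speed-up applies to the 500-slide case and treats "more than 12 hours" as exactly 12, so the result is only indicative.

```python
def scaling_efficiency(speedup_one_gpu, speedup_n_gpus, n_gpus=2):
    """Fraction of ideal linear scaling achieved when going from 1 to n_gpus GPUs."""
    return (speedup_n_gpus / speedup_one_gpu) / n_gpus


def projected_hours(cpu_hours, speedup):
    """Running time expected if the whole workload ran at the given speed-up."""
    return cpu_hours / speedup


# Figures quoted in the abstract.
SPEEDUP_ONE_GPU = 4.11   # peak speed-up over the C++ CPU code, single GPU
SPEEDUP_TWO_GPUS = 6.68  # peak speed-up over the C++ CPU code, GPU pair
CPU_HOURS = 12.0         # assumption: "more than 12 hours" taken as exactly 12

efficiency = scaling_efficiency(SPEEDUP_ONE_GPU, SPEEDUP_TWO_GPUS)
hours = projected_hours(CPU_HOURS, SPEEDUP_TWO_GPUS)

print(f"Scaling efficiency from 1 to 2 GPUs: {efficiency:.2f}")
print(f"Projected 2-GPU time for a 12 h CPU run: {hours:.2f} h")
```

Under these assumptions the second GPU delivers about 81% of ideal linear scaling, and the projected time of about 1.8 hours agrees with the reported reduction to less than 2 hours.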


Microscopic imaging, Image registration and segmentation, Pattern analysis, Feature detection, Graphics processors, High-performance computing



This work was partially supported by the Ministry of Education of Spain (TIC2003-06623, PR-2007-0014), Junta de Andalucía of Spain (P06-TIC-02109), US NIH grant R01 DC06458-01A1 and the startup fund from the Department of Biomedical Informatics at the Ohio State University, US.

We thank Dr. Gustavo Leone from the Ohio State University Cancer Center for providing the mouse placenta and mouse mammary gland images used in the experiments described in this paper. We also thank Dr. Dennis Sessanna and Dr. Donald Stredney from the Ohio Supercomputing Center for providing access to the BALE visualization cluster, where most of our execution times were obtained.



Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  • Antonio Ruiz (1)
  • Manuel Ujaldon (1)
  • Lee Cooper (2)
  • Kun Huang (2)

  1. Computer Architecture Department, Campus Teatinos, University of Malaga, Malaga, Spain
  2. Biomedical Informatics Department, Ohio State University, Columbus, USA
