Design and Implementation of Parallel Nonrigid Image Registration Using Off-the-Shelf Supercomputers
This paper presents a new parallel algorithm for nonrigid image registration on off-the-shelf supercomputers, that is, clusters of PCs. Our algorithm achieves scalable registration of high-resolution three-dimensional (3-D) images by employing three techniques: (1) data distribution; (2) data-parallel processing; and (3) dynamic load balancing. The experimental results show that our parallel implementation on a cluster of 64 off-the-shelf PCs (128 processors in total) registers liver CT images of 512×512×159 voxels within 8 minutes, whereas a sequential implementation takes 12 hours. Furthermore, our implementation reduces the memory each processor needs, and thereby enables us to align 1024×1024×590 voxel images, which is difficult on single-processor systems because of limits on memory space and processing time.
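The figures quoted in the abstract can be put in perspective with a short calculation. The Python sketch below derives the implied speedup and parallel efficiency from the reported timings, and estimates the raw size of the two image volumes. The function names are illustrative, and the storage size of 2 bytes per voxel is an assumption (the abstract does not state the voxel depth). Because the parallel run is reported as "within 8 minutes", the speedup computed here is a lower bound.

```python
# Figures quoted in the abstract.
SEQUENTIAL_MINUTES = 12 * 60  # sequential implementation: 12 hours
PARALLEL_MINUTES = 8          # parallel implementation: "within 8 minutes" (upper bound)
PROCESSORS = 128              # 64 PCs, 128 processors in total

# Assumption, not from the abstract: voxels stored as 16-bit integers.
BYTES_PER_VOXEL = 2


def speedup(t_sequential, t_parallel):
    """Ratio of sequential to parallel run time."""
    return t_sequential / t_parallel


def efficiency(s, processors):
    """Speedup divided by the number of processors used."""
    return s / processors


def volume_bytes(dims, bytes_per_voxel):
    """Raw storage for a single 3-D image of the given dimensions."""
    nx, ny, nz = dims
    return nx * ny * nz * bytes_per_voxel


s = speedup(SEQUENTIAL_MINUTES, PARALLEL_MINUTES)
e = efficiency(s, PROCESSORS)
small = volume_bytes((512, 512, 159), BYTES_PER_VOXEL)
large = volume_bytes((1024, 1024, 590), BYTES_PER_VOXEL)

print(f"speedup    >= {s:.0f}x")
print(f"efficiency >= {e:.1%}")
print(f"512x512x159 volume:   {small / 2**20:.1f} MiB")
print(f"1024x1024x590 volume: {large / 2**30:.2f} GiB")
```

Under these assumptions the larger volume holds roughly 15 times as many voxels as the smaller one, before counting the second image of the pair or any working data such as deformation fields, which is consistent with the abstract's point that distributing the data across processors is what makes such images tractable.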