Methodology to Increase the Computational Speed to Obtain the Fractal Dimension Using GPU Programming
Computing the fractal dimension (FD) can be very time-consuming. The precision and resolution of many sensors (magnetic resonance, ultrasound, microcomputed tomography, etc.) keep increasing, and some applications additionally require post-processing of 3D data. Processing large data sets is also common in many analyses and applications. Fast algorithms for computing the FD are therefore required, especially for interactive applications. Graphics processing unit (GPU) programming has become a standard tool for optimizing certain kinds of time-consuming algorithms: if a problem fits the GPU programming model well, high speedups can be achieved. CUDA and OpenCL are two of the most popular GPU technologies because they do not require special knowledge of computer graphics programming. In this chapter, we present our experience in reducing the processing time of the classic box-counting algorithm for computing the FD by means of CUDA and OpenCL GPU programming. We obtained speedups of up to 28× (CUDA) and 6.3× (OpenCL) over the single-threaded CPU version of the algorithm. The CUDA results are better because the box-counting algorithm depends strongly on sorting, and the OpenCL implementations of the best sorting algorithms are not as efficient as their CUDA counterparts.
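To illustrate why box counting depends so heavily on sorting, the following is a minimal single-threaded sketch of one common sort-based formulation; it is not necessarily the implementation used in the chapter. For each box size, every point is mapped to the integer index of its enclosing box, the indices are sorted, and the distinct ones are counted in a single pass. The FD estimate is the least-squares slope of log N(s) against log(1/s). The function names `count_boxes` and `box_counting_dimension`, the integer voxel coordinates, and the test data are all assumptions made for this example.

```python
import math


def count_boxes(points, box_size):
    """Count occupied boxes of edge `box_size` using sort + adjacent-unique."""
    # Quantize each point to the integer index of the box that contains it.
    keys = sorted(
        (x // box_size, y // box_size, z // box_size) for x, y, z in points
    )
    # After sorting, equal keys are adjacent, so one linear pass
    # is enough to count the distinct boxes.
    return sum(1 for i, k in enumerate(keys) if i == 0 or k != keys[i - 1])


def box_counting_dimension(points, box_sizes):
    """Least-squares slope of log N(s) against log(1/s)."""
    xs = [math.log(1.0 / s) for s in box_sizes]
    ys = [math.log(count_boxes(points, s)) for s in box_sizes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Hypothetical test data: a filled 32 x 32 square in the z = 0 plane
# of an integer voxel grid, which should behave as a 2D object.
plane = [(x, y, 0) for x in range(32) for y in range(32)]
fd = box_counting_dimension(plane, [1, 2, 4, 8, 16])
```

In this formulation the sort dominates the cost for large point sets. A GPU port would typically replace `sorted` and the counting pass with parallel sort and unique primitives, which is consistent with the abstract's observation that the overall speedup depends on how efficient the available sorting implementation is.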
Keywords: 3D fractal dimension; GPU; Box counting; CUDA; OpenCL
This work has been partially supported by the University of Jaén, the Caja Rural de Jaén, the Ministry of Economy and Competitiveness, and the European Union (via ERDF funds) through the research projects UJA2013/12/04, UJA2013/08/35, and TIN2014-58218-R.