Towards User Transparent Parallel Multimedia Computing on GPU-Clusters
The research area of Multimedia Content Analysis (MMCA) considers all aspects of the automated extraction of knowledge from multimedia archives and data streams. To satisfy the increasing computational demands of MMCA problems, the use of High Performance Computing (HPC) techniques is essential. As most MMCA researchers are not HPC experts, there is an urgent need for ‘familiar’ programming models and tools that are both easy to use and efficient.
Today, several user transparent library-based parallelization tools exist that aim to satisfy both these requirements. In general, such tools focus on data parallel execution on traditional compute clusters. As of yet, none of these tools also incorporate the use of many-core processors (e.g. GPUs), however. While traditional clusters are now being transformed into GPU-clusters, programming complexity vastly increases — and the need for easy and efficient programming models is as urgent as ever.
This paper presents our first steps in the direction of obtaining a user transparent programming model for data parallel and hierarchical multimedia computing on GPU-clusters. The model is obtained by extending an existing user transparent parallel programming system (applicable to traditional compute clusters) with a set of CUDA compute kernels. We show our model to be capable of obtaining orders-of-magnitude speed improvements, without requiring any additional effort from the application programmer.
KeywordsTotal Execution Time Thread Block Generalize Convolution Traditional Cluster Constant Memory
Unable to display preview. Download preview PDF.