Pipelined Multi-GPU MapReduce for Big-Data Processing


Abstract

MapReduce is a popular model for large-scale data-parallel processing. Its success has prompted several studies on implementing MapReduce on Graphics Processing Units (GPUs). However, these studies focus mostly on single-GPU algorithms and cannot handle data sets that exceed GPU memory capacity. This paper describes PMGMR, a pipelined multi-GPU MapReduce system that is an upgraded version of MGMR and addresses the challenge of big data. PMGMR harnesses the power of multiple GPUs, improves GPU utilization using newer GPU features such as streams and Hyper-Q, and handles data sets that exceed GPU memory and even CPU memory. Compared to MGMR, the proposed scheme achieves a 2.5-fold performance improvement and increases system scalability, while still allowing users to write straightforward MapReduce code.
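
The general idea of processing an input that is too large for device memory can be illustrated with a minimal CPU-only sketch. This is not PMGMR's implementation or API: the function names (`read_chunks`, `map_reduce_chunk`, `pipelined_word_count`), the word-count workload, and the use of a thread pool as a stand-in for multiple GPUs are all assumptions made for illustration. The sketch splits the input into fixed-size chunks, keeps only a bounded number of chunks in flight, and merges the partial results of each chunk into a final result.

```python
from collections import Counter, deque
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def read_chunks(records, chunk_size):
    """Yield lists of at most chunk_size records.

    Hypothetical stand-in for splitting an input that does not fit
    in device memory into pieces that do.
    """
    it = iter(records)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk


def map_reduce_chunk(chunk):
    """Map each line to its words and reduce them to partial counts."""
    counts = Counter()
    for line in chunk:
        counts.update(line.split())
    return counts


def pipelined_word_count(records, chunk_size=2, workers=2):
    """Process chunks on a worker pool (assumed stand-in for GPUs).

    At most `workers` chunks are in flight at once, so the next chunk
    is being processed while an earlier one is merged.
    """
    total = Counter()
    in_flight = deque()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for chunk in read_chunks(records, chunk_size):
            in_flight.append(pool.submit(map_reduce_chunk, chunk))
            if len(in_flight) >= workers:
                # Merge the oldest partial result before reading more input.
                total.update(in_flight.popleft().result())
        # Drain the chunks that are still being processed.
        while in_flight:
            total.update(in_flight.popleft().result())
    return total


if __name__ == "__main__":
    lines = ["a b a", "b c", "a", "c c"]
    print(dict(pipelined_word_count(lines)))
```

A real GPU pipeline would replace the worker pool with asynchronous host-to-device copies and kernel launches on separate streams; the bounded queue here only mirrors the overlap of loading, computing, and merging.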