Abstract
MapReduce is a popular large-scale data-parallel processing model. Its success has stimulated several studies of implementing MapReduce on Graphics Processing Units (GPUs). However, these studies focus most of their efforts on single-GPU algorithms and cannot handle large data sets that exceed GPU memory capacity. This paper describes PMGMR, a pipelined multi-GPU MapReduce system that upgrades MGMR and addresses the challenge of big data. PMGMR employs the power of multiple GPUs, improves GPU utilization using new GPU features such as streams and Hyper-Q, and handles large data sets that exceed GPU and even CPU memory. Compared to MGMR, the newly proposed scheme achieves a 2.5-fold performance improvement and increases system scalability, while allowing users to write straightforward MapReduce code.
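The general idea of pipelining chunks through several workers can be illustrated with a minimal Python sketch. This is not PMGMR's implementation: CPU threads stand in for GPUs, a bounded queue stands in for the overlap that streams provide between data staging and computation, and every name here (`chunked`, `map_reduce_chunk`, `pipelined_word_count`, `chunk_size`, `depth`) is hypothetical. The input is split into chunks small enough to fit an assumed device capacity, each chunk is mapped and locally reduced by whichever worker is free, and the partial results are merged at the end.

```python
import queue
import threading
from collections import Counter


def chunked(items, chunk_size):
    """Yield successive slices of at most chunk_size records.

    chunk_size is a stand-in for how much data fits in device memory.
    """
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]


def map_reduce_chunk(chunk):
    """Map each line to its words and reduce the counts within one chunk."""
    counts = Counter()
    for line in chunk:
        counts.update(line.split())
    return counts


def pipelined_word_count(lines, chunk_size=2, num_workers=2, depth=2):
    """Count words by streaming chunks through a pool of worker threads.

    The bounded queue limits how many chunks are staged at once, so the
    producer keeps loading the next chunk while workers process earlier ones.
    """
    staged = queue.Queue(maxsize=depth)
    partials = []
    lock = threading.Lock()

    def worker():
        while True:
            chunk = staged.get()
            if chunk is None:  # sentinel: no more chunks for this worker
                break
            result = map_reduce_chunk(chunk)
            with lock:
                partials.append(result)

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for w in workers:
        w.start()

    for chunk in chunked(lines, chunk_size):
        staged.put(chunk)  # blocks while the pipeline is full
    for _ in workers:
        staged.put(None)  # one sentinel per worker
    for w in workers:
        w.join()

    # Final reduce: merge the per-chunk partial results.
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total
```

Because each chunk is reduced independently and the merge is commutative, the result does not depend on which worker handles which chunk or in what order the chunks finish.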
Copyright information
© 2013 Springer International Publishing Switzerland
About this paper
Cite this paper
Chen, Y., Qiao, Z., Davis, S., Jiang, H., Li, KC. (2013). Pipelined Multi-GPU MapReduce for Big-Data Processing. In: Lee, R. (eds) Computer and Information Science. Studies in Computational Intelligence, vol 493. Springer, Heidelberg. https://doi.org/10.1007/978-3-319-00804-2_17
DOI: https://doi.org/10.1007/978-3-319-00804-2_17
Publisher Name: Springer, Heidelberg
Print ISBN: 978-3-319-00803-5
Online ISBN: 978-3-319-00804-2
eBook Packages: Engineering, Engineering (R0)