Scaling up MapReduce-based Big Data Processing on Multi-GPU systems

Abstract

MapReduce is a popular data-parallel processing model that has kept pace with recent advances in computing technology and has been widely exploited for large-scale data analysis. The high demand for MapReduce has stimulated the investigation of MapReduce implementations on different architectural models and computing paradigms, such as multi-core clusters, Clouds, Cubieboards, and GPUs. In particular, current GPU-based MapReduce approaches mainly focus on single-GPU algorithms and cannot handle large data sets due to the limited GPU memory capacity. Building on the earlier multi-GPU MapReduce implementation MGMR, this paper proposes an upgraded version, MGMR++, that eliminates the GPU memory limitation, and a pipelined version, PMGMR, that addresses the Big Data challenge by using both CPU memory and hard disks. MGMR++ extends MGMR with flexible C++ templates and CPU memory utilization, while PMGMR fine-tunes performance through hard disk utilization and the latest GPU features, such as streams and Hyper-Q. Compared to MGMR (Jiang et al., Cluster Computing 2013), the proposed schemes achieve about a 2.5-fold performance improvement, increase system scalability, and allow programmers to write straightforward MapReduce code for Big Data.
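To make the pipelining concrete, below is a minimal CUDA sketch of the general technique the abstract describes: input too large for GPU memory is staged in pinned CPU memory, split into chunks, and each chunk's host-to-device copy, map kernel, and device-to-host copy are issued on a separate CUDA stream so transfers overlap with computation; on Kepler-class GPUs, Hyper-Q lets these streams make progress concurrently. This is a sketch of the technique, not the authors' PMGMR code: the chunk size, the placeholder map kernel, and all identifiers are illustrative assumptions, and error checking is omitted for brevity.

// Sketch of stream-based pipelining for out-of-core GPU map processing.
// Assumes CUDA 5.5+; compile with, e.g.: nvcc -arch=sm_35 pipeline.cu
#include <cuda_runtime.h>
#include <cstdio>

// Placeholder "map" stage: doubles each element.
__global__ void mapKernel(const int *in, int *out, size_t n) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i] * 2;
}

int main() {
    const size_t totalElems = 1 << 24;  // full input, resident in CPU memory
    const size_t chunkElems = 1 << 20;  // one GPU-resident chunk
    const int numStreams = 4;           // concurrent pipeline lanes

    // Pinned host buffers are required for truly asynchronous copies.
    int *hIn, *hOut;
    cudaMallocHost(&hIn,  totalElems * sizeof(int));
    cudaMallocHost(&hOut, totalElems * sizeof(int));
    for (size_t i = 0; i < totalElems; ++i) hIn[i] = (int)i;

    // One device buffer pair per stream so in-flight chunks never collide.
    int *dIn[numStreams], *dOut[numStreams];
    cudaStream_t streams[numStreams];
    for (int s = 0; s < numStreams; ++s) {
        cudaMalloc(&dIn[s],  chunkElems * sizeof(int));
        cudaMalloc(&dOut[s], chunkElems * sizeof(int));
        cudaStreamCreate(&streams[s]);
    }

    // Chunk k runs on stream k % numStreams. Within a stream the three
    // operations serialize, but across streams the copy engines and SMs
    // overlap, which is the source of the pipeline speedup.
    for (size_t off = 0, k = 0; off < totalElems; off += chunkElems, ++k) {
        int s = (int)(k % numStreams);
        size_t n = (off + chunkElems <= totalElems) ? chunkElems
                                                    : totalElems - off;
        cudaMemcpyAsync(dIn[s], hIn + off, n * sizeof(int),
                        cudaMemcpyHostToDevice, streams[s]);
        mapKernel<<<(unsigned)((n + 255) / 256), 256, 0, streams[s]>>>(
            dIn[s], dOut[s], n);
        cudaMemcpyAsync(hOut + off, dOut[s], n * sizeof(int),
                        cudaMemcpyDeviceToHost, streams[s]);
    }
    cudaDeviceSynchronize();

    printf("hOut[42] = %d (expect 84)\n", hOut[42]);

    for (int s = 0; s < numStreams; ++s) {
        cudaFree(dIn[s]); cudaFree(dOut[s]); cudaStreamDestroy(streams[s]);
    }
    cudaFreeHost(hIn); cudaFreeHost(hOut);
    return 0;
}

Extending this pattern to disk, as PMGMR does, amounts to adding a producer stage that reads file chunks into the pinned buffers before they are enqueued.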

References

  1. Jiang, H., Chen, Y., Qiao, Z., Li, K.-C., Ro, W.W., Gaudiot, J.-C.: Accelerating MapReduce framework on multi-GPU systems. Cluster Computing, pp. 1–9. Springer, Berlin (2013)

  2. Cubieboards: an open ARM mini PC. http://www.cubieboard.org (2014)

  3. CUDA Programming Guide 6.0. NVIDIA (2014)

  4. Dean, J., Ghemawat, S.: MapReduce: simplified data processing on large clusters. Commun. ACM 51(1), 107–113 (2008)

  5. Chen, Y., Qiao, Z., Jiang, H., Li, K.-C., Ro, W.W.: MGMR: multi-GPU based MapReduce. Grid and Pervasive Computing. Lecture Notes in Computer Science, vol. 7861, pp. 433–442. Springer, Berlin (2013)

  6. Bollier, D., Firestone, C.M.: The Promise and Peril of Big Data. Communications and Society Program. Aspen Institute, Washington, DC (2010)

  7. Jinno, R., Seki, K., Uehara, K.: Parallel distributed trajectory pattern mining using MapReduce. In: Proceedings of the IEEE 4th International Conference on Cloud Computing Technology and Science, pp. 269–273 (2012)

  8. Lee, D., Dinov, I., Dong, B., Gutman, B., Yanovsky, I., Toga, A.W.: CUDA optimization strategies for compute- and memory-bound neuroimaging algorithms. Comput. Methods Programs Biomed. 106, 175 (2012)

  9. Raina, R., Madhavan, A., Ng, A.Y.: Large-scale deep unsupervised learning using graphics processors. In: Proceedings of the 26th International Conference on Machine Learning, Canada (2009)

  10. Fadika, Z., Dede, E., Hartog, J., Govindaraju, M.: MARLA: MapReduce for heterogeneous clusters. In: Proceedings of the 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pp. 49–56 (2012)

  11. Stuart, J.A., Owens, J.D.: Multi-GPU MapReduce on GPU clusters. In: Proceedings of the IEEE International Parallel & Distributed Processing Symposium, pp. 1068–1079 (2011)

  12. Foster, I., Kesselman, C.: The Grid 2: Blueprint for a New Computing Infrastructure. Morgan Kaufmann (2003)

  13. Czajkowski, K., Fitzgerald, S., Foster, I., Kesselman, C.: Grid information services for distributed resource sharing. In: Proceedings of the 10th IEEE International Symposium on High Performance Distributed Computing, pp. 181–194 (2001)

  14. White, T.: Hadoop: The Definitive Guide. O’Reilly Media, Sebastopol (2012)

  15. Chen, L., Huo, X., Agrawal, G.: Accelerating MapReduce on a coupled CPU-GPU architecture. In: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (2012)

  16. Nakada, H., Ogawa, H., Kudoh, T.: Stream processing with big data: SSS-MapReduce. In: Proceedings of the IEEE 4th International Conference on Cloud Computing Technology and Science, pp. 618–621 (2012)

  17. Ji, F., Ma, X.: Using shared memory to accelerate MapReduce on graphics processing units. In: Proceedings of the IEEE International Parallel & Distributed Processing Symposium, pp. 805–816 (2011)

  18. Chen, L., Agrawal, G.: Optimizing MapReduce for GPUs with effective shared memory usage. In: Proceedings of the 21st International Symposium on High-Performance Parallel and Distributed Computing, pp. 199–210 (2012)

  19. Shainer, G., Ayoub, A., Lui, P., Liu, T., Kagan, M., Trott, C.R., Scantlen, G., Crozier, P.S.: The development of Mellanox/NVIDIA GPU-Direct over InfiniBand: a new model for GPU to GPU communications. Computer Science-Research and Development, pp. 267–273. Springer, Berlin (2011)

  20. Fang, W., He, B., Luo, Q., Govindaraju, N.K.: Mars: accelerating MapReduce with graphics processors. IEEE Trans. Parallel Distrib. Syst. 22(4), 608–620 (2011)

  21. Elteir, M., Lin, H., Feng, W.C., Scogland, T.R.W.: StreamMR: an optimized MapReduce framework for AMD GPUs. In: Proceedings of the IEEE 17th International Conference on Parallel and Distributed Systems, pp. 364–371 (2011)

  22. Tuning CUDA Applications for Kepler. NVIDIA. http://docs.nvidia.com/cuda/kepler-tuning-guide/

  23. Bell, N., Hoberock, J.: Thrust: a productivity-oriented library for CUDA. In: GPU Computing Gems: Jade Edition, pp. 359–371. Morgan Kaufmann (2011)

  24. Li, X., Lu, P., Schaeffer, J., Shillington, J., Wong, P.S., Shi, H.: On the versatility of parallel sorting by regular sampling. Parallel Comput. 19(10), 1079–1103 (1993)

  25. Przydatek, B.: A fast approximation algorithm for the subset-sum problem. Int. Trans. Oper. Res. 9(4), 437–459 (2002)

  26. Fermi Compute Architecture White Paper. NVIDIA

  27. Yu, S., Tranchevent, L.-C., De Moor, B., Moreau, Y.: Optimized data fusion for kernel k-means clustering. IEEE Trans. Pattern Anal. Mach. Intell. 34(5), 1031–1039 (2012)

Acknowledgments

This research is based upon work partially supported by the National Science Foundation (USA) under grant No. 0959124, the Ministry of Science and Technology (MOST), Taiwan, under grant MOST 103-2221-E-126-010-, the Providence University research project under grant PU102-11100-A12, and NVIDIA through CUDA Center Awards. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the funding agencies or institutions.

Author information

Corresponding author

Correspondence to Kuan-Ching Li.

About this article

Cite this article

Jiang, H., Chen, Y., Qiao, Z. et al. Scaling up MapReduce-based Big Data Processing on Multi-GPU systems. Cluster Comput 18, 369–383 (2015). https://doi.org/10.1007/s10586-014-0400-1
