Abstract
Current applications of GPU processors to parallel computing tasks show excellent speedups compared with CPU processors. However, no existing middleware framework enables automatic distribution of data and processing across heterogeneous computing resources for structured and unstructured Big Data applications. We therefore propose a middleware framework for “Big Data” analytics that provides mechanisms for automatic data segmentation, distribution, execution, and information retrieval across multiple cards (CPU and GPU) and machines; a modular design for easy addition of new GPU kernels at both the analytic and processing layers; and information presentation. We describe the architecture and components of the framework, including multi-card data distribution and execution, data structures for efficient memory access, and algorithms for parallel GPU computation, and report results for various test configurations. The results show that the proposed middleware framework provides an alternative, cheaper HPC solution. Data cleansing algorithms on the GPU achieve a speedup of over two orders of magnitude compared with the same operation in MySQL on a multi-core machine, and the framework can process more than 120 million health records within 11 s.
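The chapter's GPU data-cleansing kernels are not reproduced on this page. As a minimal, CPU-side sketch of the kind of approximate string matching such cleansing relies on, the pair-wise comparison can be written with a Levenshtein edit distance; the function names and threshold below are illustrative assumptions, not the chapter's actual API, and a GPU version would evaluate many record pairs in parallel rather than one at a time.

```python
# Minimal CPU-side sketch of edit-distance-based record matching,
# the kind of approximate string comparison used in data cleansing.
# This illustrates only the per-pair computation; a GPU kernel would
# run it over millions of record pairs concurrently.

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via dynamic programming (two-row variant)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def is_duplicate(a: str, b: str, threshold: int = 2) -> bool:
    """Treat two records as likely duplicates if their edit distance is small."""
    return edit_distance(a.lower(), b.lower()) <= threshold

print(edit_distance("kitten", "sitting"))       # 3
print(is_duplicate("John Smith", "Jon Smith"))  # True
```

The two-row formulation keeps memory at O(min(|a|, |b|)), which matters when the same computation is mapped onto thousands of GPU threads with limited per-thread storage.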
Notes
- 1.
It is a CUDA binding for the Java language, which enables the use of NVIDIA GPU computing features from Java-based applications.
- 2.
Parallel Thread Execution (PTX) is a pseudo-assembly language used in the NVIDIA CUDA programming environment.
Acknowledgement
This research was done under the joint lab “NVIDIA-HP-MIMOS GPU R&D and Solution Center,” the first GPU solution center in Southeast Asia, established in October 2012. Funding for the work came from MOSTI, Malaysia. The authors would like to thank Prof. Simon See and Pradeep Gupta from NVIDIA for their support.
Copyright information
© 2015 Springer Science+Business Media Singapore
About this chapter
Cite this chapter
Karuppiah, E.K., Kok, Y.K., Singh, K. (2015). A Middleware Framework for Programmable Multi-GPU-Based Big Data Applications. In: Cai, Y., See, S. (eds) GPU Computing and Applications. Springer, Singapore. https://doi.org/10.1007/978-981-287-134-3_12
DOI: https://doi.org/10.1007/978-981-287-134-3_12
Publisher Name: Springer, Singapore
Print ISBN: 978-981-287-133-6
Online ISBN: 978-981-287-134-3
eBook Packages: Engineering (R0)