A Middleware Framework for Programmable Multi-GPU-Based Big Data Applications

  • Chapter
GPU Computing and Applications

Abstract

Current applications of GPU processors to parallel computing tasks show excellent speedups compared to CPU processors. However, no existing middleware framework enables automatic distribution of data and processing across heterogeneous computing resources for structured and unstructured Big Data applications. We therefore propose a middleware framework for “Big Data” analytics that provides mechanisms for automatic data segmentation, distribution, execution, and information retrieval across multiple cards (CPU and GPU) and machines; a modular design for easy addition of new GPU kernels at both the analytic and processing layers; and information presentation. We present the architecture and components of the framework, such as multi-card data distribution and execution, data structures for efficient memory access, and algorithms for parallel GPU computation, together with results for various test configurations. Our results show that the proposed middleware framework provides users with an alternative and cheaper HPC solution. Data cleansing algorithms on the GPU show a speedup of over two orders of magnitude compared to the same operation done in MySQL on a multi-core machine. Our framework is also capable of processing more than 120 million health data records within 11 s.
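As a rough illustration of the data segmentation step mentioned in the abstract, the sketch below splits a record range into contiguous segments, one per card, sized in proportion to a per-card weight. It is not the authors' implementation: the function name `segment_records` and the idea of weighting cards by relative capacity are assumptions made for this example only.

```python
from typing import List, Tuple


def segment_records(num_records: int,
                    device_weights: List[float]) -> List[Tuple[int, int]]:
    """Split [0, num_records) into contiguous (start, end) ranges.

    One range is produced per device, sized in proportion to the
    (hypothetical) relative capacity given in device_weights.
    """
    if num_records < 0:
        raise ValueError("num_records must be non-negative")
    if not device_weights or any(w <= 0 for w in device_weights):
        raise ValueError("device_weights must be non-empty and positive")

    total = float(sum(device_weights))
    ranges: List[Tuple[int, int]] = []
    start = 0
    cumulative = 0.0
    last = len(device_weights) - 1
    for i, weight in enumerate(device_weights):
        cumulative += weight
        # The last device takes whatever remains, so rounding never drops records.
        end = num_records if i == last else int(num_records * cumulative / total)
        ranges.append((start, end))
        start = end
    return ranges


if __name__ == "__main__":
    # Three cards, the third assumed to be twice as capable as the others.
    for device_id, (lo, hi) in enumerate(segment_records(120, [1, 1, 2])):
        print(f"device {device_id}: records [{lo}, {hi})")
```

Each `(start, end)` pair could then be handed to whichever worker drives the corresponding card. Contiguous ranges keep the host-to-device copies simple, at the cost of ignoring any skew in per-record work.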

Notes

  1. It is a CUDA binding for the Java language, which makes it possible to exploit the features of NVIDIA GPU computing from Java-based applications.

  2. Parallel Thread Execution (PTX) is a pseudo-assembly language used in the NVIDIA CUDA programming environment.

Acknowledgement

This research was carried out under the joint lab “NVIDIA-HP-MIMOS GPU R&D and Solution Center,” the first GPU solution center in Southeast Asia, established in October 2012. Funding for the work came from MOSTI, Malaysia. The authors would like to thank Prof. Simon See and Pradeep Gupta from NVIDIA for their support.

Author information

Corresponding authors

Correspondence to Ettikan K. Karuppiah, Yong Keh Kok or Keeratpal Singh.

Copyright information

© 2015 Springer Science+Business Media Singapore

About this chapter

Cite this chapter

Karuppiah, E.K., Kok, Y.K., Singh, K. (2015). A Middleware Framework for Programmable Multi-GPU-Based Big Data Applications. In: Cai, Y., See, S. (eds) GPU Computing and Applications. Springer, Singapore. https://doi.org/10.1007/978-981-287-134-3_12

  • DOI: https://doi.org/10.1007/978-981-287-134-3_12

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-287-133-6

  • Online ISBN: 978-981-287-134-3

  • eBook Packages: Engineering, Engineering (R0)
