GPU-Based Aggregation of On-Line Analytical Processing

  • Conference paper
Communications and Information Processing

Part of the book series: Communications in Computer and Information Science (CCIS, volume 288)

Abstract

OLAP (On-Line Analytical Processing) is a data- and compute-intensive application, and improving its performance is a long-standing research goal. Aggregation is one of the most frequently used operations and has a great impact on OLAP performance. Modern GPUs (Graphics Processing Units) offer more raw computing power and higher memory bandwidth than CPUs, so using them to accelerate aggregation is a natural choice. However, current GPU hardware does not support floating-point atomic operations or incremental memory allocation, so GPU algorithms must be carefully designed. In this paper, we discuss real-time aggregation in OLAP on dense and sparse datasets; our algorithms fully utilize the GPU's high parallelism and high memory bandwidth and achieve performance improvements of approximately 20X over CPU-based algorithms. On dense datasets, the source data are chunked according to the shared memory size: each thread block processes one chunk, and each thread in a block computes one cell of the chunk cuboid. The algorithms are adapted to the GPU architecture and its high parallelism, which ensures their high performance. On sparse datasets, however, the relationship between the compressed dataset and the result cuboid, whose size is unknown in advance, is complex, so no straightforward parallelization can be defined. We therefore use the sort, map, and prefix sum primitives to partition the source data, and the reduction primitive to aggregate it. Finally, we introduce the architecture of GPUOLAP (GPU-based OLAP), a prototype system that is currently under development. Our work is a promising attempt at real-time OLAP on new hardware.
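
The two strategies named in the abstract can be illustrated with a small sequential sketch. It is not the authors' implementation, and the abstract does not give their kernels; the function names `aggregate_dense` and `aggregate_sparse`, the `chunk_rows` parameter, the SUM aggregate, and the final merge of per-chunk partial results are all assumptions made for illustration. Plain Python loops stand in for GPU thread blocks and threads, so the sketch shows only the data flow, not the performance.

```python
from itertools import accumulate


def aggregate_dense(cube, chunk_rows):
    """SUM a dense 2-D cube (list of equal-length rows) over its first dimension.

    Hypothetical stand-in for the dense case: each chunk plays the role of
    the data one thread block would hold in shared memory.
    """
    n_cols = len(cube[0])
    total = [0.0] * n_cols
    for start in range(0, len(cube), chunk_rows):  # one "thread block" per chunk
        chunk = cube[start:start + chunk_rows]
        # One "thread" per cell of the chunk cuboid: each sums its own column.
        partial = [sum(row[c] for row in chunk) for c in range(n_cols)]
        # Assumed merge step: combine the per-chunk partial cuboids.
        total = [t + p for t, p in zip(total, partial)]
    return total


def aggregate_sparse(keys, values):
    """Group-by SUM over (key, value) records via sort, map, prefix sum, reduce."""
    if not keys:
        return [], []
    # Sort: bring records that share a group key next to each other.
    order = sorted(range(len(keys)), key=lambda i: keys[i])
    skeys = [keys[i] for i in order]
    svals = [values[i] for i in order]
    # Map: flag the first record of every group (1 = a new group starts here).
    heads = [1 if i == 0 or skeys[i] != skeys[i - 1] else 0
             for i in range(len(skeys))]
    # Prefix sum (inclusive scan) of the flags: scan[i] - 1 is the output slot
    # of record i, and the last value is the number of groups, so the result
    # can be allocated once at its exact size.
    scan = list(accumulate(heads))
    out_keys = [None] * scan[-1]
    out_sums = [0.0] * scan[-1]
    # Reduction: sum each contiguous segment into its slot. On a GPU this
    # would be a segmented reduction; here it is a sequential loop.
    for key, val, slot in zip(skeys, svals, scan):
        out_keys[slot - 1] = key
        out_sums[slot - 1] += val
    return out_keys, out_sums
```

In the sparse sketch, the prefix sum is what addresses the two hardware limits the abstract mentions: the size of the result is known before anything is written, so no incremental allocation is needed, and each group is summed within its own contiguous segment, so no floating-point atomic updates are required.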

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wang, G., Zhou, G. (2012). GPU-Based Aggregation of On-Line Analytical Processing. In: Zhao, M., Sha, J. (eds) Communications and Information Processing. Communications in Computer and Information Science, vol 288. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31965-5_28

  • DOI: https://doi.org/10.1007/978-3-642-31965-5_28

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-31964-8

  • Online ISBN: 978-3-642-31965-5

  • eBook Packages: Computer Science, Computer Science (R0)
