The Design and Implementation of CoGaDB: A Column-oriented GPU-accelerated DBMS

  • Focus article (Schwerpunktbeitrag)
Datenbank-Spektrum

Abstract

Nowadays, processor performance is primarily bound by a fixed energy budget, the so-called power wall. This forces hardware vendors to optimize processors for specific tasks, which leads to an increasingly heterogeneous hardware landscape. Although efficient algorithms for modern processors such as GPUs are heavily investigated, we also need to prepare the database optimizer to handle computations on heterogeneous processors. GPUs are an interesting basis for case studies because they already exhibit many of the difficulties we will face with future processors.

In this paper, we present CoGaDB, a main-memory DBMS with built-in GPU acceleration that is optimized for OLAP workloads. CoGaDB uses the self-tuning optimizer framework HyPE to build a hardware-oblivious optimizer, which learns cost models for database operators and distributes a workload efficiently across the available processors. Furthermore, CoGaDB implements efficient algorithms on CPU and GPU and provides efficient support for star joins. We show how these novel techniques interact with each other in a single system. Our evaluation shows that CoGaDB quickly adapts to the underlying hardware by increasing the accuracy of its cost models at runtime.
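
To make the idea of a learning, hardware-oblivious optimizer more concrete, the following is a minimal sketch and not HyPE's actual interface. It assumes that the runtime of an operator on a given processor can be approximated by a linear function of the input size, fitted by least squares from observed executions. The names `LearnedCostModel`, `Placement`, `choose`, and `feedback` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class LearnedCostModel:
    """Least-squares fit of runtime = a + b * input_size for one
    (operator, processor) pair. The linear form is an assumption."""

    def __init__(self):
        self.samples = []  # observed (input_size, runtime) pairs

    def observe(self, input_size, runtime):
        # Record one measured execution; the fit is recomputed on demand.
        self.samples.append((input_size, runtime))

    def estimate(self, input_size):
        n = len(self.samples)
        if n < 2:
            # Too few observations to fit a line: return an optimistic
            # estimate so that an untrained processor still gets tried.
            return 0.0
        mean_x = sum(x for x, _ in self.samples) / n
        mean_y = sum(y for _, y in self.samples) / n
        var_x = sum((x - mean_x) ** 2 for x, _ in self.samples)
        if var_x == 0:
            return mean_y  # all samples share one input size
        slope = sum((x - mean_x) * (y - mean_y)
                    for x, y in self.samples) / var_x
        return mean_y + slope * (input_size - mean_x)


class Placement:
    """Picks, per operator, the processor with the lowest estimated cost."""

    def __init__(self, processors):
        self.processors = list(processors)
        self.models = defaultdict(LearnedCostModel)

    def choose(self, operator, input_size):
        return min(
            self.processors,
            key=lambda p: self.models[(operator, p)].estimate(input_size),
        )

    def feedback(self, operator, processor, input_size, runtime):
        # Feeding measured runtimes back refines the model at runtime.
        self.models[(operator, processor)].observe(input_size, runtime)
```

In this sketch the optimizer needs no hard-coded knowledge about a device: it only sees measured runtimes, so a fixed data-transfer overhead on a co-processor simply shows up as a larger intercept in that processor's fitted model.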

Notes

  1. http://wwwiti.cs.uni-magdeburg.de/iti_db/research/gpu/cogadb/

  2. Many main-memory OLTP systems use a row-oriented data layout.

  3. We are aware of Accelerated Processing Units (APUs) from AMD, which integrate a CPU and a GPU on a single chip. However, APUs increase only the raw processing power of the machine, not memory bandwidth.

  4. https://www.threadingbuildingblocks.org.

  5. http://thrust.github.io.

  6. Hyper-threading was enabled during our experiments.

Acknowledgement

We thank Jens Teubner from TU Dortmund University and Theo Härder from the University of Kaiserslautern for their helpful feedback.

Author information

Corresponding author

Correspondence to Sebastian Breß.

About this article

Cite this article

Breß, S. The Design and Implementation of CoGaDB: A Column-oriented GPU-accelerated DBMS. Datenbank Spektrum 14, 199–209 (2014). https://doi.org/10.1007/s13222-014-0164-z

  • DOI: https://doi.org/10.1007/s13222-014-0164-z
