Exploring the Design Space of a GPU-Aware Database Architecture

  • Sebastian Breß
  • Max Heimel
  • Norbert Siegmund
  • Ladjel Bellatreche
  • Gunter Saake
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 241)


The vast amount of processing power and memory bandwidth provided by modern graphics cards makes them an interesting platform for data-intensive applications. Unsurprisingly, the database research community identified GPUs as effective co-processors for data processing several years ago. Since then, many approaches have been proposed to exploit GPUs at different levels of a database system. In this paper, we summarize the major findings of the literature on GPU-accelerated data processing. Based on this survey, we present key properties, important trade-offs, and typical challenges of GPU-aware database architectures, and identify major open research questions.


Keywords: Design Space, Query Processing, Graphic Card, Processing Device, Query Optimization
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Sebastian Breß (1)
  • Max Heimel (2)
  • Norbert Siegmund (1)
  • Ladjel Bellatreche (3)
  • Gunter Saake (1)

  1. School of Computer Science, University of Magdeburg, Magdeburg, Germany
  2. Technische Universität Berlin, Berlin, Germany
  3. LIAS/ISAE-ENSMA, Futuroscope, Poitiers, France
