Volume 19, Issue 3, pp 183–197

Evaluating the Vector Supercomputer SX-Aurora TSUBASA as a Co-Processor for In-Memory Database Systems

  • Johannes Pietrzyk
  • Dirk Habich
  • Patrick Damme
  • Erich Focht
  • Wolfgang Lehner


In-memory column-store database systems are the state of the art for efficiently processing analytical workloads. In these systems, data compression and vectorization play an important role. Currently, vectorized processing relies on the regular SIMD (Single Instruction Multiple Data) extensions of modern processors. For example, Intel’s latest SIMD extension supports 512-bit vector registers, which allow eight 64-bit values to be processed in parallel. From a database system perspective, this vectorization technique is attractive not only for compression and decompression, where it reduces the computational overhead, but also for database operators such as joins, scans, and groupings. In contrast to these SIMD extensions, NEC Corporation has recently introduced a novel pure vector engine (supercomputer) as a co-processor called SX-Aurora TSUBASA. This vector engine features a vector length of 16,384 bits and the world’s highest bandwidth of up to 1.2 TB/s, which makes it a good fit for data-intensive applications such as in-memory database systems. Therefore, we describe the unique architecture and properties of this novel vector engine in this paper. Moreover, we present selected in-memory column-store-specific evaluation results to show the benefits of this vector engine compared to regular SIMD extensions. Finally, we conclude the paper with an outlook on our ongoing research activities in this direction.
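
To make the register-width arithmetic in the abstract concrete, the following minimal Python sketch computes how many 64-bit values fit into one vector register and how many vector iterations a column of a given length would need. The function names `lanes` and `vector_steps` are hypothetical and chosen for illustration only; the sketch counts iterations and makes no statement about measured performance, which also depends on memory bandwidth and instruction latencies.

```python
def lanes(register_bits: int, element_bits: int) -> int:
    """Number of elements of the given width that fit into one vector register."""
    if register_bits % element_bits != 0:
        raise ValueError("element width must divide register width")
    return register_bits // element_bits


def vector_steps(num_values: int, register_bits: int, element_bits: int = 64) -> int:
    """Number of vector iterations (including a partial last one) to touch num_values elements."""
    per_step = lanes(register_bits, element_bits)
    return -(-num_values // per_step)  # ceiling division


# Widths taken from the abstract; the constant names are illustrative.
AVX512_BITS = 512         # Intel's 512-bit SIMD registers
SX_AURORA_BITS = 16_384   # SX-Aurora TSUBASA vector length

if __name__ == "__main__":
    for name, bits in (("512-bit SIMD", AVX512_BITS), ("SX-Aurora TSUBASA", SX_AURORA_BITS)):
        print(name, lanes(bits, 64), "x 64-bit values per register;",
              vector_steps(1_000_000, bits), "iterations for 1,000,000 values")
```

Under these assumptions, a single SX-Aurora TSUBASA vector holds 32 times as many 64-bit values as a 512-bit SIMD register, so a purely element-wise pass over a column needs correspondingly fewer vector iterations.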


  • Vectorization
  • NEC SX-Aurora TSUBASA
  • Column stores
  • Experimental evaluation
  • SIMD extension



This work was funded by NEC Corporation within the project “Highly vectorized query processing on compressed columnar data”.



Copyright information

© Gesellschaft für Informatik e.V. and Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  1. TU Dresden – Professur für Datenbanken, Dresden, Germany
  2. NEC HPC Europe GmbH, Stuttgart, Germany
