JCC-H: Adding Join Crossing Correlations with Skew to TPC-H

  • Peter Boncz
  • Angelos-Christos Anadiotis
  • Steffen Kläbe
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10661)


We introduce JCC-H, a drop-in replacement for the data and query generator of TPC-H that introduces Join-Crossing-Correlations (JCC) and skew into its dataset and query workload. These correlations are carefully designed so that the filter predicates on table columns in the existing TPC-H queries can now affect the value, frequency and join-fan-out distributions experienced by operators in the query plan. The query generator of JCC-H can generate parameter bindings for the 22 query templates in two different equivalence classes: query templates that receive “normal” parameters do not experience skew and behave very similarly to default TPC-H queries, whereas query templates expanded with the “skewed” parameters experience strong join-crossing-correlations and skew in filter, aggregation and join operations. In this paper we discuss the goals of JCC-H and its detailed design, and show initial experiments on both a single-server and an MPP database system that confirm that our design goals were largely met. In all, JCC-H provides a convenient way for any system that is already tested with TPC-H to examine how well it handles skew and correlations; we hope the community can use it to make progress on issues such as skew mitigation and the detection and exploitation of join-crossing-correlations in query optimizers and data storage.
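
To make the notion of a join-crossing correlation concrete, the following minimal Python sketch builds two toy tables in which a column of one table determines the join fan-out into the other. It is an illustration only and not the JCC-H generator: the table layout, the function names `make_toy_tables` and `join_fanout`, and all constants (`hot_day`, `hot_fanout`, `base_fanout`) are assumptions invented for this example.

```python
from collections import Counter


def make_toy_tables(num_orders=1000, hot_day=7, hot_fanout=50, base_fanout=2):
    """Build toy orders/lineitem tables (hypothetical schema, not TPC-H's)."""
    # orders rows are (order_key, order_day); days cycle through 0..99,
    # so every day value occurs equally often in the orders table.
    orders = [(key, key % 100) for key in range(num_orders)]
    lineitems = []
    for order_key, order_day in orders:
        # Join-crossing correlation: how many lineitem rows an order has
        # depends on a column (order_day) stored in the orders table.
        fanout = hot_fanout if order_day == hot_day else base_fanout
        lineitems.extend((order_key, line_no) for line_no in range(fanout))
    return orders, lineitems


def join_fanout(orders, lineitems, day):
    """Average number of lineitem rows joined per order placed on `day`."""
    selected = {key for key, order_day in orders if order_day == day}
    matches = Counter(key for key, _ in lineitems if key in selected)
    return sum(matches.values()) / len(selected)


orders, lineitems = make_toy_tables()
normal = join_fanout(orders, lineitems, day=3)  # stands in for a "normal" binding
skewed = join_fanout(orders, lineitems, day=7)  # stands in for a "skewed" binding
print(normal, skewed)
```

Both filters select the same number of orders, so an optimizer that estimates join output from the filter selectivity alone would predict the same cost for both, yet the second binding produces a much larger join result. This is the kind of effect that the “skewed” parameter class is meant to expose.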



This paper is a result of the “Parallelism and Skew” working group at Dagstuhl seminar 17222 (Robust Performance in Database Query Processing). We would like to thank group members Johann-Christoph Freytag (HU Berlin), Alfons Kemper (TU Munich), Glenn Paulley (SAP Canada) and Kai-Uwe Sattler (TU Ilmenau) for their contributions. The research of A.C. Anadiotis was partially funded by the Swiss National Science Foundation, Project No.: 200021_146407/1 (FN–X–Core).



Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  • Peter Boncz (1), corresponding author
  • Angelos-Christos Anadiotis (2)
  • Steffen Kläbe (3)
  1. CWI, Amsterdam, Netherlands
  2. EPFL, Lausanne, Switzerland
  3. TU Ilmenau, Ilmenau, Germany
