TPC-H Analyzed: Hidden Messages and Lessons Learned from an Influential Benchmark

  • Peter Boncz
  • Thomas Neumann
  • Orri Erling
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8391)

Abstract

The TPC-D benchmark was developed almost 20 years ago, and even though its current incarnation, TPC-H, could be considered superseded by TPC-DS, one can still learn from it. We focus on the technical level, summarizing the challenges posed by the TPC-H workload as we now understand them, which we call “choke points”. We identify 28 such choke points, grouped into six categories: Aggregation Performance, Join Performance, Data Access Locality, Expression Calculation, Correlated Subqueries and Parallel Execution. On the meta-level, we make the point that the rich set of choke points found in TPC-H sets an example of how to design future DBMS benchmarks.

Keywords

Query Execution; Query Plan; Expression Calculation; Workload Management; String Comparison
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Peter Boncz (1)
  • Thomas Neumann (2)
  • Orri Erling (3)
  1. CWI, Amsterdam, The Netherlands
  2. Technical University Munich, Germany
  3. Openlink Software, United Kingdom