TPC-H Analyzed: Hidden Messages and Lessons Learned from an Influential Benchmark

  • Conference paper
Performance Characterization and Benchmarking (TPCTC 2013)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 8391)

Abstract

The TPC-D benchmark was developed almost 20 years ago, and even though its current incarnation, TPC-H, could be considered superseded by TPC-DS, one can still learn from it. We focus on the technical level, summarizing the challenges posed by the TPC-H workload as we now understand them, which we call “choke points”. We identify 28 such choke points, grouped into six categories: Aggregation Performance, Join Performance, Data Access Locality, Expression Calculation, Correlated Subqueries and Parallel Execution. On the meta-level, we make the point that the rich set of choke points found in TPC-H sets an example of how to design future DBMS benchmarks.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Boncz, P., Neumann, T., Erling, O. (2014). TPC-H Analyzed: Hidden Messages and Lessons Learned from an Influential Benchmark. In: Nambiar, R., Poess, M. (eds) Performance Characterization and Benchmarking. TPCTC 2013. Lecture Notes in Computer Science, vol 8391. Springer, Cham. https://doi.org/10.1007/978-3-319-04936-6_5

  • DOI: https://doi.org/10.1007/978-3-319-04936-6_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-04935-9

  • Online ISBN: 978-3-319-04936-6

  • eBook Packages: Computer Science, Computer Science (R0)