
BigBench Specification V0.1

BigBench: An Industry Standard Benchmark for Big Data Analytics
  • Tilmann Rabl
  • Ahmad Ghazal
  • Minqing Hu
  • Alain Crolotte
  • Francois Raab
  • Meikel Poess
  • Hans-Arno Jacobsen
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8163)

Abstract

In this article, we present the specification of BigBench, an end-to-end big data benchmark proposal. BigBench models a retail product supplier. The benchmark proposal covers a data model and a set of big data specific queries. BigBench’s synthetic data generator addresses the variety, velocity and volume aspects of big data workloads. The structured part of the BigBench data model is adopted from the TPC-DS benchmark. In addition, the structured schema is enriched with semi-structured and unstructured data components that are common in a retail product supplier environment. This specification contains the full query set as well as the data model.

Keywords

Sentiment Analysis, Business Intelligence, Online Review, Product Review, Online Store
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


References

  1. Ghazal, A., Rabl, T., Hu, M., Raab, F., Poess, M., Crolotte, A., Jacobsen, H.A.: BigBench: Towards an industry standard benchmark for big data analytics. In: Proceedings of the ACM SIGMOD Conference (2013)
  2. Friedman, E., Pawlowski, P., Cieslewicz, J.: SQL/MapReduce: A Practical Approach to Self-Describing, Polymorphic, and Parallelizable User-Defined Functions. PVLDB 2(2), 1402–1413 (2009)
  3. Teradata Aster: Teradata Aster Big Analytics Appliance 3H - Analytics Foundation User Guide. Release 5.0.1 edn (2012), http://www.info.teradata.com/edownload.cfm?itemid=123060004
  4. Laney, D.: 3D Data Management: Controlling Data Volume, Velocity and Variety. Technical report, Meta Group (2001)
  5. Nambiar, R.O., Poess, M.: The Making of TPC-DS. In: VLDB, pp. 1049–1058 (2006)
  6. Rabl, T., Frank, M., Sergieh, H.M., Kosch, H.: A Data Generator for Cloud-Scale Benchmarking. In: Nambiar, R., Poess, M. (eds.) TPCTC 2010. LNCS, vol. 6417, pp. 41–56. Springer, Heidelberg (2011)
  7. Manyika, J., Chui, M., Brown, B., Bughin, J., Dobbs, R., Roxburgh, C., Byers, A.H.: Big data: The Next Frontier for Innovation, Competition, and Productivity. Technical report, McKinsey Global Institute (2011), http://www.mckinsey.com/insights/mgi/research/technology_and_innovation/big_data_the_next_frontier_for_innovation

Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Tilmann Rabl (1)
  • Ahmad Ghazal (2)
  • Minqing Hu (2)
  • Alain Crolotte (2)
  • Francois Raab (3)
  • Meikel Poess (4)
  • Hans-Arno Jacobsen (1)

  1. University of Toronto, Canada
  2. Teradata Corp., USA
  3. InfoSizing Inc., USA
  4. Oracle Corp., USA
