Technology Conference on Performance Evaluation and Benchmarking

TPCTC 2014: Performance Characterization and Benchmarking. Traditional to Big Data, pp 44-63

Discussion of BigBench: A Proposed Industry Standard Performance Benchmark for Big Data

  • Chaitanya Baru
  • Milind Bhandarkar
  • Carlo Curino
  • Manuel Danisch
  • Michael Frank
  • Bhaskar Gowda
  • Hans-Arno Jacobsen
  • Huang Jie
  • Dileep Kumar
  • Raghunath Nambiar
  • Meikel Poess
  • Francois Raab
  • Tilmann Rabl
  • Nishkam Ravi
  • Kai Sachs
  • Saptak Sen
  • Lan Yi
  • Choonhan Youn
Conference paper

DOI: 10.1007/978-3-319-15350-6_4

Volume 8904 of the book series Lecture Notes in Computer Science (LNCS)
Cite this paper as:
Baru C. et al. (2015) Discussion of BigBench: A Proposed Industry Standard Performance Benchmark for Big Data. In: Nambiar R., Poess M. (eds) Performance Characterization and Benchmarking. Traditional to Big Data. TPCTC 2014. Lecture Notes in Computer Science, vol 8904. Springer, Cham

Abstract

Enterprises perceive a huge opportunity in mining information that can be found in big data. New storage systems and processing paradigms are allowing for ever larger data sets to be collected and analyzed. The high demand for data analytics and rapid development in technologies have led to a sizable ecosystem of big data processing systems. However, the lack of established, standardized benchmarks makes it difficult for users to choose the appropriate systems that suit their requirements. To address this problem, we have developed the BigBench benchmark specification. BigBench is the first end-to-end big data analytics benchmark suite. In this paper, we present the BigBench benchmark and analyze the workload from a technical as well as a business point of view. We characterize the queries in the workload along different dimensions, according to their functional characteristics, and also analyze their runtime behavior. Finally, we evaluate the suitability and relevance of the workload from the point of view of enterprise applications, and discuss potential extensions to the proposed specification in order to cover typical big data processing use cases.

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Chaitanya Baru (11)
  • Milind Bhandarkar (10)
  • Carlo Curino (7)
  • Manuel Danisch (1)
  • Michael Frank (1)
  • Bhaskar Gowda (6)
  • Hans-Arno Jacobsen (8)
  • Huang Jie (6)
  • Dileep Kumar (3)
  • Raghunath Nambiar (2)
  • Meikel Poess (9)
  • Francois Raab (5)
  • Tilmann Rabl (1, 8)
  • Nishkam Ravi (3)
  • Kai Sachs (12)
  • Saptak Sen (4)
  • Lan Yi (6)
  • Choonhan Youn (11)

  1. Bankmark, Passau, Germany
  2. Cisco Systems, San Jose, USA
  3. Cloudera, Palo Alto, USA
  4. Hortonworks, Santa Clara, USA
  5. Infosizing, Manitou Springs, USA
  6. Intel Corporation, Santa Clara, USA
  7. Microsoft Corporation, Redmond, USA
  8. Middleware Systems Research Group, Toronto, Canada
  9. Oracle Corporation, Redwood City, USA
  10. Pivotal, Vancouver, Canada
  11. San Diego Supercomputer Center, La Jolla, USA
  12. SPEC Research Group, Gainesville, USA