Setting the Direction for Big Data Benchmark Standards

  • Conference paper
Selected Topics in Performance Evaluation and Benchmarking (TPCTC 2012)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 7755)

Abstract

The Workshop on Big Data Benchmarking (WBDB2012), held on May 8-9, 2012 in San Jose, CA, served as an incubator for several promising approaches to defining a big data benchmark standard for industry. An open forum covered a number of issues related to big data benchmarking, including definitions of big data terms, benchmark processes, and auditing. Through these discussions, attendees were able to extend their own view of big data benchmarking and communicate their own ideas, which ultimately led to the formation of small working groups to continue collaborative work in this area. In this paper, we summarize the discussions and outcomes from this first workshop, which was attended by about 60 invitees representing 45 different organizations from both industry and academia. Workshop attendees were selected based on their experience and expertise in the management of big data, database systems, performance benchmarking, and big data applications. There was consensus among participants about both the need and the opportunity for defining benchmarks that capture the end-to-end aspects of big data applications. Following the model of TPC benchmarks, it was felt that big data benchmarks should include metrics not only for performance but also for price/performance, along with a sound foundation for fair comparison through audit mechanisms. Additionally, the benchmarks should consider several costs relevant to big data systems, including total cost of acquisition, setup cost, and total cost of ownership, including energy cost. The second Workshop on Big Data Benchmarking will be held in December 2012 in Pune, India, and the third meeting is being planned for July 2013 in Xi’an, China.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Baru, C., Bhandarkar, M., Nambiar, R., Poess, M., Rabl, T. (2013). Setting the Direction for Big Data Benchmark Standards. In: Nambiar, R., Poess, M. (eds) Selected Topics in Performance Evaluation and Benchmarking. TPCTC 2012. Lecture Notes in Computer Science, vol 7755. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36727-4_14

  • DOI: https://doi.org/10.1007/978-3-642-36727-4_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-36726-7

  • Online ISBN: 978-3-642-36727-4

  • eBook Packages: Computer Science, Computer Science (R0)
