Benchmarking Elastic Query Processing on Big Data

  • Conference paper
Big Data Benchmarking (WBDB 2014)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8991)

Abstract

Existing analytical query benchmarks, such as TPC-H, often assess database system performance on on-premises hardware installations. On the other hand, some benchmarks for cloud-based analytics deal with flexible infrastructure, but often focus on simpler queries and semi-structured data. With our benchmark draft we attempt to bridge the gap by challenging analytical platforms to answer complex queries on structured business data while leveraging the elastic infrastructure of the cloud to satisfy performance requirements.


Corresponding author

Correspondence to Dimitri Vorona.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Vorona, D., Funke, F., Kemper, A., Neumann, T. (2015). Benchmarking Elastic Query Processing on Big Data. In: Rabl, T., Sachs, K., Poess, M., Baru, C., Jacobsen, H.-A. (eds) Big Data Benchmarking. WBDB 2014. Lecture Notes in Computer Science, vol 8991. Springer, Cham. https://doi.org/10.1007/978-3-319-20233-4_5

  • DOI: https://doi.org/10.1007/978-3-319-20233-4_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-20232-7

  • Online ISBN: 978-3-319-20233-4

  • eBook Packages: Computer Science (R0)
