
Experience from Hadoop Benchmarking with HiBench: From Micro-Benchmarks Toward End-to-End Pipelines

  • Conference paper
Advancing Big Data Benchmarks (WBDB 2013)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8585)

Abstract

As Hadoop-based big data frameworks grow in pervasiveness and scale, realistic benchmarking of Hadoop systems becomes critically important to the Hadoop community and industry. In this paper, we present our experience of benchmarking Hadoop with HiBench (an open source Hadoop benchmark suite widely used by Hadoop users) and, based on that experience, introduce our recent work on advanced end-to-end ETL-recommendation pipelines.

Jinquan Dai: This work was done while the author was working at Intel Asia-Pacific Research and Development Ltd.

Corresponding author

Correspondence to Lan Yi.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Yi, L., Dai, J. (2014). Experience from Hadoop Benchmarking with HiBench: From Micro-Benchmarks Toward End-to-End Pipelines. In: Rabl, T., Raghunath, N., Poess, M., Bhandarkar, M., Jacobsen, H.-A., Baru, C. (eds) Advancing Big Data Benchmarks. WBDB 2013. Lecture Notes in Computer Science, vol 8585. Springer, Cham. https://doi.org/10.1007/978-3-319-10596-3_4

  • DOI: https://doi.org/10.1007/978-3-319-10596-3_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-10595-6

  • Online ISBN: 978-3-319-10596-3

  • eBook Packages: Computer Science, Computer Science (R0)
