Time and Cost-Efficient Modeling and Generation of Large-Scale TPCC/TPCE/TPCH Workloads

  • Conference paper

Part of the Lecture Notes in Computer Science book series (LNPSE, volume 7144)

Abstract

Large-scale TPC workloads are critical for evaluating datacenter-scale storage systems. However, these workloads have not previously been characterized in depth or modeled in a datacenter (DC) environment. In this work, we categorize the TPC workloads into storage threads that have unique features, and we characterize the storage activity of TPCC, TPCE and TPCH based on I/O traces from real server installations. We also propose a framework for modeling and generating large-scale TPC workloads, which allows us to conduct a wide spectrum of storage experiments without requiring knowledge of the application's structure or incurring the overhead of fully deploying it in different storage configurations. Using our framework, we eliminate the TPC setup time and reduce experiment time by two orders of magnitude, owing to the compression of storage activity enforced by the model. We demonstrate the accuracy of the model and the applicability of our method to significant datacenter storage challenges, including identification of early disk errors and SSD caching.

Keywords

  • Workload
  • Modeling
  • Storage Traces
  • TPC benchmarks
  • Characterization
  • Storage Configuration
  • Datacenter

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Delimitrou, C., Sankar, S., Khessib, B., Vaid, K., Kozyrakis, C. (2012). Time and Cost-Efficient Modeling and Generation of Large-Scale TPCC/TPCE/TPCH Workloads. In: Nambiar, R., Poess, M. (eds) Topics in Performance Evaluation, Measurement and Characterization. TPCTC 2011. Lecture Notes in Computer Science, vol 7144. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32627-1_11

  • DOI: https://doi.org/10.1007/978-3-642-32627-1_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-32626-4

  • Online ISBN: 978-3-642-32627-1

  • eBook Packages: Computer Science, Computer Science (R0)