Abstract
Large-scale TPC workloads are critical for the evaluation of datacenter-scale storage systems. However, these workloads have not previously been characterized in depth or modeled in a datacenter (DC) environment. In this work, we categorize the TPC workloads into storage threads that have unique features, and we characterize the storage activity of TPCC, TPCE and TPCH based on I/O traces from real server installations. We also propose a framework for modeling and generating large-scale TPC workloads, which allows us to conduct a wide spectrum of storage experiments without requiring knowledge of the application's structure or incurring the overhead of fully deploying it in different storage configurations. Using our framework, we eliminate the time for TPC setup and reduce the time for experiments by two orders of magnitude, due to the compression in storage activity enforced by the model. We demonstrate the accuracy of the model and the applicability of our method to significant datacenter storage challenges, including identification of early disk errors and SSD caching.
Keywords
- Workload
- Modeling
- Storage Traces
- TPC benchmarks
- Characterization
- Storage Configuration
- Datacenter
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Delimitrou, C., Sankar, S., Khessib, B., Vaid, K., Kozyrakis, C. (2012). Time and Cost-Efficient Modeling and Generation of Large-Scale TPCC/TPCE/TPCH Workloads. In: Nambiar, R., Poess, M. (eds) Topics in Performance Evaluation, Measurement and Characterization. TPCTC 2011. Lecture Notes in Computer Science, vol 7144. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32627-1_11
DOI: https://doi.org/10.1007/978-3-642-32627-1_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32626-4
Online ISBN: 978-3-642-32627-1
eBook Packages: Computer Science, Computer Science (R0)
