Characterizing and optimizing TPC-C workloads on large-scale systems using SSD arrays

  • Jidong Zhai
  • Feng Zhang
  • Qingwen Li
  • Wenguang Chen
  • Weimin Zheng
Research Paper

Abstract

Transaction Processing Performance Council Benchmark C (TPC-C) is the de facto standard for evaluating the performance of high-end computers running on-line transaction processing applications. Unlike other standard benchmarks, the Transaction Processing Performance Council defines only the specification of the TPC-C benchmark and does not provide a standard implementation for end users. Because the TPC-C workload is complex, obtaining optimal TPC-C performance on a large-scale high-end computer is challenging. In this paper, we design and implement a large-scale TPC-C evaluation system based on the latest TPC-C specification using solid-state drive (SSD) storage devices. By analyzing the characteristics of the TPC-C workload, we propose a series of system-level optimization methods to improve TPC-C performance. First, we propose an approach based on SmallFile table space that distributes the test data across all of the disk array partitions in a round-robin manner, making full use of the underlying disk arrays. Second, we propose using a NOOP-based disk scheduling algorithm to reduce processor utilization and shorten the average input/output service time. Third, to improve the system translation lookaside buffer hit rate and reduce processor overhead, we use the huge page technique to manage the large amount of memory. Lastly, we propose a locality-aware interrupt mapping strategy, based on the asymmetry of non-uniform memory access systems, to improve system performance. Using these optimization methods, we performed the TPC-C test on two large-scale high-end computers using SSD arrays. The experimental results show that our methods effectively improve TPC-C performance. For example, the TPC-C test on an Intel Westmere server reached 1.018 million transactions per minute.
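As a rough illustration of two of the ideas in the abstract, the sketch below shows (a) round-robin placement of data files across disk array partitions and (b) the arithmetic behind why huge pages reduce the number of page mappings the translation lookaside buffer has to cover. It is not the authors' implementation: the function names, the partition names, and the 64 GiB region size are hypothetical, chosen only for illustration. The 4 KiB and 2 MiB page sizes are the common base and huge page sizes on x86-64.

```python
from collections import defaultdict

KIB, MIB, GIB = 1024, 1024**2, 1024**3


def place_datafiles(num_files, partitions):
    """Assign data file indices to partitions in round-robin order.

    File i goes to partition i mod len(partitions), so consecutive files
    land on different partitions and each partition receives an
    (almost) equal share of the files.
    """
    if not partitions:
        raise ValueError("at least one partition is required")
    layout = defaultdict(list)
    for i in range(num_files):
        layout[partitions[i % len(partitions)]].append(i)
    return dict(layout)


def pages_needed(region_bytes, page_bytes):
    """Return how many pages of size page_bytes cover region_bytes (rounded up)."""
    return -(-region_bytes // page_bytes)


if __name__ == "__main__":
    # Hypothetical partition names; a real system would use its own devices.
    parts = ["part0", "part1", "part2", "part3"]
    for name, files in place_datafiles(8, parts).items():
        print(name, files)

    # Hypothetical 64 GiB memory region mapped with base pages vs huge pages.
    region = 64 * GIB
    print("4 KiB pages:", pages_needed(region, 4 * KIB))
    print("2 MiB pages:", pages_needed(region, 2 * MIB))
```

With 2 MiB pages the same region needs 512 times fewer mappings than with 4 KiB pages, which is the intuition behind the improved translation lookaside buffer hit rate mentioned above.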

Keywords

TPC-C benchmark; OLTP workload; performance analysis; performance optimization; high-end computers

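Optimizing the TPC-C benchmark on large-scale systems

Abstract

Contributions

This paper proposes a series of system-level optimization methods for the TPC-C benchmark: 1) organizing the TPC-C test data in SmallFile table space mode, which fully exploits the concurrent processing capability of the underlying disk arrays and balances upper-layer I/O requests; 2) managing the underlying SSD arrays with the NOOP disk scheduling policy, which reduces the average I/O request service time; 3) optimizing TPC-C memory usage with huge pages, which raises the system TLB hit rate and lowers processor overhead; 4) binding interrupts to fixed processor cores according to the type of interrupt request, which effectively reduces the interference of interrupt requests with processor performance.

Keywords

TPC-C; SSD arrays; performance analysis; on-line transaction processing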

Copyright information

© Science China Press and Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  • Jidong Zhai (1)
  • Feng Zhang (1)
  • Qingwen Li (1)
  • Wenguang Chen (1)
  • Weimin Zheng (1)

  1. Department of Computer Science and Technology, Tsinghua University, Beijing, China
