Characterizing and optimizing TPC-C workloads on large-scale systems using SSD arrays

Optimizing the TPC-C benchmark program on large-scale systems

Abstract

Transaction Processing Performance Council Benchmark C (TPC-C) is the de facto standard for evaluating the performance of high-end computers running online transaction processing (OLTP) applications. Unlike other standard benchmarks, TPC-C is defined only as a specification: the Transaction Processing Performance Council does not provide a standard implementation for end users. Because the TPC-C workload is complex, obtaining optimal TPC-C performance on a large-scale high-end computer is challenging. In this paper, we design and implement a large-scale TPC-C evaluation system based on the latest TPC-C specification, using solid-state drive (SSD) storage devices. By analyzing the characteristics of the TPC-C workload, we propose a series of system-level optimization methods to improve TPC-C performance. First, we propose an approach based on SmallFile tablespaces that distributes the test data round-robin across all disk array partitions, making full use of the underlying disk arrays. Second, we propose using a NOOP-based disk scheduling algorithm to reduce processor utilization and improve the average input/output (I/O) service time. Third, to improve the translation lookaside buffer (TLB) hit rate and reduce processor overhead, we use huge pages to manage the large amount of memory. Finally, we propose a locality-aware interrupt mapping strategy that exploits the asymmetry of non-uniform memory access (NUMA) systems to improve system performance. Using these optimization methods, we ran the TPC-C test on two large-scale high-end computers with SSD arrays. The experimental results show that our methods effectively improve TPC-C performance. For example, the TPC-C test on an Intel Westmere server reached 1.018 million transactions per minute.
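The round-robin data placement described in the abstract can be illustrated with a minimal sketch. It only shows the placement idea, not the paper's implementation: the function name `assign_round_robin`, the partition names, and the `data_NNNN.dbf` file naming scheme are all hypothetical.

```python
from collections import defaultdict


def assign_round_robin(num_files, partitions):
    """Assign data files 0..num_files-1 to partitions in round-robin order.

    Hypothetical helper: file names and partition labels are illustrative
    assumptions, not taken from the paper.
    """
    if not partitions:
        raise ValueError("at least one partition is required")
    layout = defaultdict(list)
    for i in range(num_files):
        # File i goes to partition i mod N, so consecutive files land on
        # different partitions and the per-partition counts differ by at most 1.
        target = partitions[i % len(partitions)]
        layout[target].append(f"data_{i:04d}.dbf")
    return dict(layout)


if __name__ == "__main__":
    parts = ["/dev/sdb1", "/dev/sdc1", "/dev/sdd1"]  # assumed partition names
    for partition, files in assign_round_robin(8, parts).items():
        print(partition, files)
```

Splitting the data into many small files and cycling through the partitions keeps the number of files per partition balanced, which is the property the abstract relies on to spread I/O across all of the disk arrays.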

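Summary

Highlights

This paper proposes a series of system-level optimization methods for the TPC-C benchmark program: 1) organizing the TPC-C test data in SmallFile tablespace mode, which fully exploits the concurrent processing capability of the underlying disk arrays and balances upper-layer I/O requests; 2) managing the underlying SSD arrays with the NOOP disk scheduling policy, which reduces the average I/O request processing time; 3) optimizing TPC-C memory usage with huge pages, which improves the system TLB hit rate and reduces processor overhead; 4) binding interrupts to fixed processor cores according to the interrupt request type, which effectively reduces the interference of interrupt requests with processor performance.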
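The interrupt-binding idea in point 4 can be sketched as a planning step that assigns each interrupt to a core on the NUMA node closest to its device. This is a sketch under stated assumptions, not the paper's strategy: the functions `cpu_mask` and `plan_irq_affinity`, the IRQ numbers, and the node-to-core layout are hypothetical. The only real interface referenced is the Linux file `/proc/irq/<N>/smp_affinity`, which takes a hexadecimal CPU bitmask; the sketch computes masks but does not write them.

```python
def cpu_mask(cpus):
    """Return a hex CPU bitmask in comma-separated 32-bit groups."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    hex_str = format(mask, "x")
    # Pad to a multiple of 8 hex digits, then split into 32-bit groups.
    width = -(-len(hex_str) // 8) * 8
    hex_str = hex_str.zfill(width)
    return ",".join(hex_str[i:i + 8] for i in range(0, width, 8))


def plan_irq_affinity(irq_to_node, node_to_cpus):
    """Map each IRQ to one core on its local NUMA node, round-robin per node.

    irq_to_node:  {irq_number: numa_node}  (assumed to be known in advance)
    node_to_cpus: {numa_node: [cpu_ids]}   (assumed topology)
    """
    next_slot = {node: 0 for node in node_to_cpus}
    plan = {}
    for irq in sorted(irq_to_node):
        node = irq_to_node[irq]
        cpus = node_to_cpus[node]
        cpu = cpus[next_slot[node] % len(cpus)]
        next_slot[node] += 1
        plan[irq] = cpu_mask([cpu])
    return plan


if __name__ == "__main__":
    # Hypothetical IRQ numbers and topology, for illustration only.
    plan = plan_irq_affinity({40: 0, 41: 0, 42: 1}, {0: [0, 1], 1: [2, 3]})
    for irq, mask in plan.items():
        # Applying the plan would mean writing `mask` to
        # /proc/irq/<irq>/smp_affinity as root; here it is only printed.
        print(irq, mask)
```

Keeping the planning separate from the privileged write makes the mapping easy to inspect before it is applied to a running system.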

Author information

Correspondence to Weimin Zheng.

About this article

Cite this article

Zhai, J., Zhang, F., Li, Q. et al. Characterizing and optimizing TPC-C workloads on large-scale systems using SSD arrays. Sci. China Inf. Sci. 59, 92104 (2016). https://doi.org/10.1007/s11432-015-5383-x

Keywords

  • TPC-C benchmark
  • OLTP workload
  • performance analysis
  • performance optimization
  • high-end computers

Keywords

  • TPC-C
  • SSD arrays
  • performance analysis
  • online transaction processing