TPCx-HS is the first industry standard Big Data benchmark for objectively measuring the performance and price/performance of Apache Hadoop and Apache Spark compatible software distributions. It stresses both the hardware and software stacks including the operating system, execution engine (MapReduce or Spark), and Hadoop Filesystem API compatible layers. TPCx-HS can be used to assess a broad range of system topologies and implementation methodologies in a technically rigorous and directly comparable, vendor-neutral manner.
Up until 2007, Jim Gray defined, sponsored, and administered a number of sort benchmarks (Anon et al. 1985) available to the general community. These include Minute Sort, Gray Sort, Penny Sort, Joule Sort, Datamation Sort, and TeraByte Sort. TeraByte Sort measures the amount of time taken (in minutes) to sort 1 TB (10^12 bytes) of data.
In 2009, Owen O’Malley and others...
- Anon et al (1985) A measure of transaction processing power. A condensed version of this paper appears in Datamation, April 1, 1985. This paper was scanned from the Tandem Technical Report TR 85.2 in 2001 and reformatted by Jim Gray
- Huppler K, Johnson D (2013) TPC express – a new path for TPC benchmarks. TPCTC, pp 48–60
- Nambiar R, Wakou N, Masland A, Thawley P, Lanken M, Carman F, Majdalany M (2011) Shaping the landscape of industry standard benchmarks: contributions of the Transaction Processing Performance Council (TPC). TPCTC, pp 1–9
- TPC Energy (2018) http://www.tpc.org/tpc_energy/default.asp. Last accessed 17 Jan 2018
- TPC Pricing (2018) http://www.tpc.org/pricing/default.asp. Last accessed 17 Jan 2018
- TPCx-HS Benchmark (2018) http://www.tpc.org/tpcx-hs/default.asp?version=2. Last accessed 17 Jan 2018