Selected Topics in Performance Evaluation and Benchmarking

Volume 7755 of the series Lecture Notes in Computer Science, pp. 108-123

BDMS Performance Evaluation: Practices, Pitfalls, and Possibilities

  • Michael J. Carey, Information Systems Group, Computer Sciences Department, University of California, Irvine

Abstract

Much of the IT world today is buzzing about Big Data, and we are witnessing the emergence of a new generation of data-oriented platforms aimed at storing and processing all of the anticipated Big Data. The current generation of Big Data Management Systems (BDMSs) can largely be divided into two kinds of platforms: systems for Big Data analytics, which today tend to be batch-oriented and based on MapReduce (e.g., Hadoop), and systems for Big Data storage and front-end request-serving, which are usually based on key-value (a.k.a. NoSQL) stores. In this paper we ponder the problem of evaluating the performance of such systems. After taking a brief historical look at Big Data management and DBMS benchmarking, we begin our pondering of BDMS performance evaluation by reviewing several key recent efforts to measure and compare the performance of BDMSs. Next we discuss a series of potential pitfalls that such evaluation efforts should watch out for, pitfalls mostly based on the author’s own experiences with past benchmarking efforts. Finally, we close by discussing some of the unmet needs and future possibilities with regard to BDMS performance characterization and assessment efforts.

Keywords

Data-intensive computing, Big Data, performance, benchmarking, MapReduce, Hadoop, key-value stores, NoSQL systems