The VLDB Journal, Volume 26, Issue 1, pp 31–54

Resource bricolage and resource selection for parallel database systems

  • Jiexing Li
  • Jeffrey F. Naughton
  • Rimma V. Nehme
Special Issue Paper

DOI: 10.1007/s00778-016-0435-4

Cite this article as:
Li, J., Naughton, J.F. & Nehme, R.V. The VLDB Journal (2017) 26: 31. doi:10.1007/s00778-016-0435-4

Abstract

Running parallel database systems in an environment with heterogeneous resources has become increasingly common, due to cluster evolution and increasing interest in moving applications into public clouds. Performance differences among machines in the same cluster pose new challenges for parallel database systems. First, for database systems running in a heterogeneous cluster, the default uniform data partitioning strategy may overload some of the slow machines while underutilizing the more powerful ones. Since the processing time of a parallel query is determined by the slowest machine, such an allocation strategy may significantly degrade query performance. Second, since machines might have varying resources or performance, different choices of machines may lead to different costs or performance for executing the same workload. By carefully selecting the most suitable machines for running a workload, we may achieve better performance with the same budget, or meet the same performance requirements at a lower cost. We address these challenges by introducing techniques we call resource bricolage and resource selection, which improve database performance in heterogeneous environments. Our approaches quantify the performance differences among machines with various resources as they process workloads with diverse resource requirements. For better resource utilization, we formalize the minimization of workload execution time as an optimization problem and employ linear programming to obtain a recommended data partitioning scheme. For better resource selection, we formalize two problems: one minimizes the total workload execution time under a given budget, and the other minimizes the total budget needed to meet a given performance target. We then employ different mixed-integer programs to search for the optimal resource selection decisions. We verify the effectiveness of both resource bricolage and resource selection techniques with an extensive experimental study.
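
To make the partitioning idea concrete, here is a minimal sketch of a deliberately simplified case. It is not the authors' formulation, which covers workloads with diverse resource requirements. Assume a single processing step, and assume we know how long each machine would take to process the entire data set alone (the `full_scan_times` input is hypothetical). The linear program "minimize T subject to share_i * time_i <= T and the shares summing to 1" then has a closed-form optimum: each machine receives a share proportional to its speed, so all machines finish at the same moment. The function names are illustrative only.

```python
from fractions import Fraction


def proportional_partition(full_scan_times):
    """Return (shares, makespan) for a single processing step.

    full_scan_times[i] is the assumed time machine i would need to
    process the whole data set by itself. Shares are proportional to
    speed (1 / time), which equalizes the finish times of all machines.
    """
    speeds = [1 / Fraction(t) for t in full_scan_times]
    total_speed = sum(speeds)
    shares = [s / total_speed for s in speeds]
    makespan = 1 / total_speed  # every machine finishes at this time
    return shares, makespan


def uniform_makespan(full_scan_times):
    """Finish time of the slowest machine under uniform partitioning."""
    n = len(full_scan_times)
    return max(Fraction(t) / n for t in full_scan_times)


# Hypothetical cluster: one fast, one medium, and one slow machine.
times = [10, 20, 40]
shares, best = proportional_partition(times)
print("shares:", [str(s) for s in shares])
print("proportional makespan:", best)
print("uniform makespan:", uniform_makespan(times))
```

In this hypothetical cluster, uniform partitioning leaves the job waiting on the slowest machine, while the proportional split gives the fast machine the largest share. With several steps that stress different resources, no single closed form applies, which is where a general linear programming solver would be needed.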

Keywords

  • Resource bricolage
  • Resource selection
  • Parallel database systems
  • Heterogeneous clusters
  • Performance prediction
  • Data partitioning

Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  • Jiexing Li (1)
  • Jeffrey F. Naughton (2)
  • Rimma V. Nehme (3)
  1. Google Inc, Mountain View, USA
  2. Department of Computer Sciences, University of Wisconsin, Madison, Madison, USA
  3. Microsoft Jim Gray Systems Lab, Madison, USA
