DBMS Data Loading: An Analysis on Modern Hardware

  • Adam Dziedzic
  • Manos Karpathiotakis
  • Ioannis Alagiannis
  • Raja Appuswamy
  • Anastasia Ailamaki
Conference paper

DOI: 10.1007/978-3-319-56111-0_6

Part of the Lecture Notes in Computer Science book series (LNCS, volume 10195)
Cite this paper as:
Dziedzic A., Karpathiotakis M., Alagiannis I., Appuswamy R., Ailamaki A. (2017) DBMS Data Loading: An Analysis on Modern Hardware. In: Blanas S., Bordawekar R., Lahiri T., Levandoski J., Pavlo A. (eds) Data Management on New Hardware. IMDM 2016, ADMS 2016. Lecture Notes in Computer Science, vol 10195. Springer, Cham

Abstract

Data loading has traditionally been considered a “one-time deal”: an offline process outside the critical path of query execution. The architecture of DBMS is aligned with this assumption. Nevertheless, the rate at which data is produced and gathered nowadays has nullified the “one-off” assumption and has turned data loading into a major bottleneck of the data analysis pipeline.

This paper analyzes the behavior of modern DBMS in order to quantify their ability to fully exploit multicore processors and modern storage hardware during data loading. We examine multiple state-of-the-art DBMS, a variety of hardware configurations, and a combination of synthetic and real-world datasets to identify bottlenecks in the data loading process and to provide guidelines on how to accelerate data loading. Our findings show that modern DBMS are unable to saturate the available hardware resources. We therefore identify opportunities to accelerate data loading.

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Adam Dziedzic (1)
  • Manos Karpathiotakis (2)
  • Ioannis Alagiannis (2)
  • Raja Appuswamy (2)
  • Anastasia Ailamaki (2, 3)

  1. University of Chicago, Chicago, USA
  2. Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
  3. RAW Labs SA, Lausanne, Switzerland
