International Conference on Data Management Technologies and Applications

DATA 2015: Data Management Technologies and Applications pp 135-153

Needles in the Haystack — Tackling Bit Flips in Lightweight Compressed Data

  • Till Kolditz
  • Dirk Habich
  • Dmitrii Kuvaiskii
  • Wolfgang Lehner
  • Christof Fetzer
Conference paper

DOI: 10.1007/978-3-319-30162-4_9

Volume 584 of the book series Communications in Computer and Information Science (CCIS)
Cite this paper as:
Kolditz T., Habich D., Kuvaiskii D., Lehner W., Fetzer C. (2016) Needles in the Haystack — Tackling Bit Flips in Lightweight Compressed Data. In: Helfert M., Holzinger A., Belo O., Francalanci C. (eds) Data Management Technologies and Applications. DATA 2015. Communications in Computer and Information Science, vol 584. Springer, Cham

Abstract

Modern database systems are very often in the position to store their entire data in main memory. Aside from increased main memory capacities, a further driver for in-memory database systems has been the shift to a column-oriented storage format in combination with lightweight data compression techniques. Using both of these software concepts, large datasets can be held and efficiently processed in main memory with a low memory footprint. Unfortunately, hardware is becoming more and more vulnerable to random faults, so that, for example, the rate of bit flips in main memory increases, and this rate is likely to escalate in future dynamic random-access memory (DRAM) modules. Since the data is highly compressed by the lightweight compression algorithms, multi-bit flips will have an extreme impact on the reliability of database systems. To tackle this reliability issue, we introduce our research on error-resilient lightweight data compression algorithms in this paper. Of course, our software approach lacks the efficiency of a hardware realization, but its flexibility and adaptability will play a more important role with regard to differing error rates, e.g., due to hardware aging effects and aggressive processor voltage and frequency scaling. Arithmetic AN encoding is one family of codes that is an interesting candidate for effective software-based error detection. We present results of our research showing trade-offs between the compressibility and resiliency characteristics of data. We show that particular choices of the AN-code parameter lead to a moderate loss of performance. We provide an evaluation of two proposed techniques, namely AN-encoded Null Suppression and AN-encoded Run Length Encoding.
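To make the idea of arithmetic AN encoding more concrete, the following is a minimal sketch and not the authors' implementation. It assumes the textbook form of AN coding, in which an integer is multiplied by a constant A and any code word that is not a multiple of A is treated as corrupted. The constant `A = 641`, the function names, and the choice to AN-encode both the run values and the run lengths in the run-length example are illustrative assumptions that are not taken from the paper.

```python
A = 641  # hypothetical odd constant; the paper evaluates its own parameter choices


def an_encode(value: int, a: int = A) -> int:
    """Encode an integer as a multiple of a."""
    return value * a


def an_check(code: int, a: int = A) -> bool:
    """A code word is considered valid only if it is divisible by a."""
    return code % a == 0


def an_decode(code: int, a: int = A) -> int:
    """Decode a code word, raising if the divisibility check fails."""
    if not an_check(code, a):
        raise ValueError("bit flip detected: code word is not a multiple of A")
    return code // a


def rle_encode_an(values, a: int = A):
    """Run-length encode a sequence, storing AN-encoded (value, length) pairs.

    Assumption of this sketch: both fields are AN-encoded. Because AN codes
    are closed under addition, a run is extended by adding a to its length.
    """
    runs = []
    for v in values:
        code = an_encode(v, a)
        if runs and runs[-1][0] == code:
            runs[-1] = (code, runs[-1][1] + a)  # length += 1 in the code domain
        else:
            runs.append((code, a))  # a run of length 1 is stored as 1 * a
    return runs


def rle_decode_an(runs, a: int = A):
    """Expand AN-encoded runs, checking every field before using it."""
    out = []
    for code, length in runs:
        out.extend([an_decode(code, a)] * an_decode(length, a))
    return out


if __name__ == "__main__":
    data = [7, 7, 7, 3, 3, 9]
    runs = rle_encode_an(data)
    assert rle_decode_an(runs) == data

    # Flip one bit of the first stored value and observe that it is detected.
    corrupted = runs[0][0] ^ (1 << 3)
    print(an_check(corrupted))
```

With an odd A greater than 1, a single bit flip changes the code word by a power of two, which is never a multiple of A, so the divisibility check always catches it. Detection of multi-bit flips depends on the particular A, and each code word also needs more bits than the original value, which is one way to read the trade-off between compressibility and resiliency that the abstract mentions.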

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Till Kolditz (1)
  • Dirk Habich (1)
  • Dmitrii Kuvaiskii (2)
  • Wolfgang Lehner (1)
  • Christof Fetzer (2)
  1. Technische Universität Dresden, Database Systems Group, Dresden, Germany
  2. Technische Universität Dresden, Systems Engineering Group, Dresden, Germany