Needles in the Haystack — Tackling Bit Flips in Lightweight Compressed Data

  • Till Kolditz
  • Dirk Habich
  • Dmitrii Kuvaiskii
  • Wolfgang Lehner
  • Christof Fetzer
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 584)

Abstract

Modern database systems can often keep their entire data in main memory. Aside from increased main memory capacities, a further driver for in-memory database systems has been the shift to a column-oriented storage format in combination with lightweight data compression techniques. Using both software concepts, large datasets can be held and efficiently processed in main memory with a low memory footprint. Unfortunately, hardware is becoming more and more vulnerable to random faults, so that, e.g., the probability of bit flips in main memory increases, and this rate is likely to escalate in future dynamic random-access memory (DRAM) modules. Since the data is highly compressed by the lightweight compression algorithms, multi-bit flips will have an extreme impact on the reliability of database systems. To tackle this reliability issue, we introduce our research on error-resilient lightweight data compression algorithms in this paper. Of course, our software approach lacks the efficiency of a hardware realization, but its flexibility and adaptability will play a more important role regarding differing error rates, e.g., due to hardware aging effects and aggressive processor voltage and frequency scaling. Arithmetic AN encoding is one family of codes that is an interesting candidate for effective software-based error detection. We present results of our research showing tradeoffs between the compressibility and resiliency characteristics of data. We show that particular choices of the AN-code parameter lead to a moderate loss of performance. We provide an evaluation of two proposed techniques, namely AN-encoded Null Suppression and AN-encoded Run Length Encoding.
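
The abstract does not spell out how AN encoding is combined with compression, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the textbook form of an AN code (a value is stored as its product with a constant A, and a stored word is accepted only if it is divisible by A) and applies it to the values and run lengths of a plain run-length encoding. The constant `A = 641` and all function names are hypothetical choices made for illustration.

```python
# Sketch only: textbook AN code layered over a simple run-length encoding.
# A = 641 is an arbitrary odd constant chosen for illustration; the abstract
# does not state which values of A the paper evaluates.
A = 641


def an_encode(value: int, a: int = A) -> int:
    """Store a value as its multiple of a."""
    return value * a


def an_check(code: int, a: int = A) -> bool:
    """A stored word is considered intact only if it is divisible by a."""
    return code % a == 0


def an_decode(code: int, a: int = A) -> int:
    """Recover the value, refusing words that fail the divisibility check."""
    if not an_check(code, a):
        raise ValueError("corrupted code word detected")
    return code // a


def rle_encode(values):
    """Plain run-length encoding: a list of (value, run_length) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [(v, n) for v, n in runs]


def an_rle_encode(values, a: int = A):
    """Hypothetical combination: AN-encode both the value and the run length."""
    return [(an_encode(v, a), an_encode(n, a)) for v, n in rle_encode(values)]


def an_rle_decode(runs, a: int = A):
    """Check and expand every run; raises ValueError on a detected flip."""
    out = []
    for code_value, code_length in runs:
        out.extend([an_decode(code_value, a)] * an_decode(code_length, a))
    return out


data = [7, 7, 7, 3, 3, 9]
encoded = an_rle_encode(data)
restored = an_rle_decode(encoded)

# Simulate a single bit flip in the first stored value.
flipped = encoded[0][0] ^ (1 << 4)
flip_detected = not an_check(flipped)
```

A single bit flip changes a stored word by plus or minus a power of two, which an odd A greater than one never divides, so the check above catches every single-bit flip. The price is that each stored word grows by roughly log2(A) bits, which illustrates the kind of tradeoff between compressibility and resiliency that the abstract refers to.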

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Till Kolditz (1)
  • Dirk Habich (1)
  • Dmitrii Kuvaiskii (2)
  • Wolfgang Lehner (1)
  • Christof Fetzer (2)

  1. Technische Universität Dresden, Database Systems Group, Dresden, Germany
  2. Technische Universität Dresden, Systems Engineering Group, Dresden, Germany
