Practical Evaluation of Lempel-Ziv-78 and Lempel-Ziv-Welch Tries

  • Johannes Fischer
  • Dominik Köppl
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10508)


We present the first thorough practical study of the Lempel-Ziv-78 and the Lempel-Ziv-Welch computation based on trie data structures. With a careful selection of trie representations we can beat well-tuned popular trie data structures like Judy, m-Bonsai or Cedar.
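For orientation, the following is a minimal sketch of LZ78 factorization driven by a dynamic trie. It assumes the simplest possible representation, a single hash table mapping (parent node, character) to child node. It is not the tudocomp implementation and not one of the trie representations evaluated in the paper, and the function names `lz78_factorize` and `lz78_decode` are hypothetical.

```python
def lz78_factorize(text):
    """Return LZ78 factors as (referred_factor, char) pairs.

    Factor index 0 denotes the empty phrase, i.e. the trie root.
    """
    # Assumption: the trie is one hash table, (parent_node, char) -> child_node.
    trie = {}
    factors = []
    node = 0       # current trie node; 0 is the root
    next_id = 1    # id given to the next trie node (equals the factor index)
    for ch in text:
        child = trie.get((node, ch))
        if child is not None:
            node = child                 # phrase is known: keep descending
        else:
            trie[(node, ch)] = next_id   # new leaf ends the current factor
            next_id += 1
            factors.append((node, ch))
            node = 0                     # restart at the root
    if node != 0:
        # The text ended inside a known phrase: emit it without a new character.
        factors.append((node, ""))
    return factors


def lz78_decode(factors):
    """Rebuild the text from the (referred_factor, char) pairs."""
    phrases = [""]  # phrases[0] is the empty phrase
    for ref, ch in factors:
        phrases.append(phrases[ref] + ch)
    return "".join(phrases[1:])
```

Every input character triggers one trie lookup or insertion, which is why the choice of trie representation dominates the running time and memory footprint of the factorization.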


Lempel-Ziv compression, Dynamic tries, Hashing



We are grateful to Marvin Löbel for providing the foundation of the LZ78/LZW framework in tudocomp. Further, we thank Andreas Poyias for sharing the source code of the m-Bonsai trie [34] and the compact hash table [33].


  1. Arroyuelo, D., Navarro, G.: Space-efficient construction of Lempel-Ziv compressed text indexes. Inf. Comput. 209(7), 1070–1102 (2011)
  2. Arz, J., Fischer, J.: LZ-compressed string dictionaries. In: Proceedings of the DCC, pp. 322–331. IEEE Press (2014)
  3. Askitis, N.: Fast and compact hash tables for integer keys. In: Proceedings of the ACSC. CRPIT, vol. 91, pp. 101–110. Australian Computer Society (2009)
  4. Bannai, H., Inenaga, S., Takeda, M.: Efficient LZ78 factorization of grammar compressed text. In: Calderón-Benavides, L., González-Caro, C., Chávez, E., Ziviani, N. (eds.) SPIRE 2012. LNCS, vol. 7608, pp. 86–98. Springer, Heidelberg (2012). doi: 10.1007/978-3-642-34109-0_10
  5. Belazzougui, D., Puglisi, S.J.: Range predecessor and Lempel-Ziv parsing. In: Proceedings of the SODA, pp. 2053–2071. SIAM (2016)
  6. Bentley, J.L., Sedgewick, R.: Fast algorithms for sorting and searching strings. In: Proceedings of the SODA, pp. 360–369. ACM/SIAM (1997)
  7. Black, J.R., Martel, C.U., Qi, H.: Graph and hashing algorithms for modern architectures: design and performance. In: Proceedings of the WAE, pp. 37–48. Max-Planck-Institut für Informatik (1998)
  8. Carter, L., Wegman, M.N.: Universal classes of hash functions. J. Comput. Syst. Sci. 18(2), 143–154 (1979)
  9. Chung, K., Mitzenmacher, M., Vadhan, S.P.: Why simple hash functions work: exploiting the entropy in a data stream. Theor. Comput. 9, 897–945 (2013)
  10. Cleary, J.G.: Compact hash tables using bidirectional linear probing. IEEE Trans. Comput. 33(9), 828–834 (1984)
  11. Dinklage, P., Fischer, J., Köppl, D., Löbel, M., Sadakane, K.: Compression with the tudocomp framework. In: Proceedings of the SEA. LIPIcs, vol. 75, pp. 13:1–13:22 (2017)
  12. Feldman, J.A., Low, J.R.: Comment on Brent’s scatter storage algorithm. Commun. ACM 16(11), 703 (1973)
  13. Fischer, J., Gawrychowski, P.: Alphabet-dependent string searching with wexponential search trees. In: Cicalese, F., Porat, E., Vaccaro, U. (eds.) CPM 2015. LNCS, vol. 9133, pp. 160–171. Springer, Cham (2015). doi: 10.1007/978-3-319-19929-0_14
  14. Fischer, J., I, T., Köppl, D.: Lempel Ziv computation in small space (LZ-CISS). In: Cicalese, F., Porat, E., Vaccaro, U. (eds.) CPM 2015. LNCS, vol. 9133, pp. 172–184. Springer, Cham (2015). doi: 10.1007/978-3-319-19929-0_15
  15. Gonnet, G.H., Baeza-Yates, R.A.: An analysis of the Karp-Rabin string matching algorithm. Inf. Process. Lett. 34(5), 271–274 (1990)
  16. Heileman, G.L., Luo, W.: How caching affects hashing. In: Proceedings of the ALENEX, pp. 141–154. SIAM (2005)
  17. Jansson, J., Sadakane, K., Sung, W.K.: Linked dynamic tries with applications to LZ-compression in sublinear time and space. Algorithmica 71(4), 969–988 (2015)
  18. Kärkkäinen, J., Kempa, D., Puglisi, S.J.: Lazy Lempel-Ziv factorization algorithms. ACM J. Exp. Algorithmics 21(1), 2.4:1–2.4:19 (2016)
  19. Kärkkäinen, J., Kempa, D., Puglisi, S.J.: Lightweight Lempel-Ziv parsing. In: Bonifaci, V., Demetrescu, C., Marchetti-Spaccamela, A. (eds.) SEA 2013. LNCS, vol. 7933, pp. 139–150. Springer, Heidelberg (2013). doi: 10.1007/978-3-642-38527-8_14
  20. Karp, R.M., Rabin, M.O.: Efficient randomized pattern-matching algorithms. IBM J. Res. Dev. 31(2), 249–260 (1987)
  21. Knuth, D.: Sorting and Searching, the Art of Computer Programming, vol. III. Addison-Wesley, Reading (1973)
  22. Köppl, D., Sadakane, K.: Lempel-Ziv computation in compressed space. In: Proceedings of the DCC, pp. 3–12. IEEE Press (2016)
  23. Lemire, D.: The universality of iterated hashing over variable-length strings. Discrete Appl. Math. 160(4–5), 604–617 (2012)
  24. Lemire, D., Kaser, O.: Recursive n-gram hashing is pairwise independent, at best. Comput. Speech Lang. 24(4), 698–710 (2010)
  25. Lemire, D., Kaser, O.: Faster 64-bit universal hashing using carry-less multiplications. J. Cryptographic Eng. 6(3), 171–185 (2016)
  26. Luan, H., Du, X., Wang, S., Ni, Y., Chen, Q.: J+-Tree: a new index structure in main memory. In: Kotagiri, R., Krishna, P.R., Mohania, M., Nantajeewarawat, E. (eds.) DASFAA 2007. LNCS, vol. 4443, pp. 386–397. Springer, Heidelberg (2007). doi: 10.1007/978-3-540-71703-4_34
  27. Maier, T., Sanders, P.: Dynamic Space Efficient Hashing. ArXiv CoRR 1705.00997 (2017)
  28. Marsaglia, G.: Xorshift RNGs. J. Stat. Softw. 8(14), 1–6 (2003)
  29. McIlroy, M.D.: A research UNIX reader: annotated excerpts from the programmer’s manual, 1971–1986. Technical report CSTR 139, AT&T Bell Laboratories (1987)
  30. Munro, J.I., Navarro, G., Nekrich, Y.: Space-efficient construction of compressed indexes in deterministic linear time. In: Proceedings of the SODA, pp. 408–424. SIAM (2017)
  31. Nakashima, Y., I, T., Inenaga, S., Bannai, H., Takeda, M.: Constructing LZ78 tries and position heaps in linear time for large alphabets. Inf. Process. Lett. 115(9), 655–659 (2015)
  32. Navarro, G.: Implementing the LZ-index: theory versus practice. ACM J. Exp. Algorithmics 13(2), 2:1.1–2:1.49 (2008)
  33. Poyias, A., Puglisi, S.J., Raman, R.: Compact dynamic rewritable (CDRW) arrays. In: Proceedings of the ALENEX, pp. 109–119. SIAM (2017)
  34. Poyias, A., Raman, R.: Improved practical compact dynamic tries. In: Iliopoulos, C., Puglisi, S., Yilmaz, E. (eds.) SPIRE 2015. LNCS, vol. 9309, pp. 324–336. Springer, Cham (2015). doi: 10.1007/978-3-319-23826-5_31
  35. Robinson, R.M.: Mersenne and Fermat numbers. Proc. Amer. Math. Soc. 5(5), 842–846 (1954)
  36. Steele Jr., G.L., Lea, D., Flood, C.H.: Fast splittable pseudorandom number generators. In: Proceedings of the OOPSLA, pp. 453–472. ACM (2014)
  37. Tchebychev, P.: Mémoire sur les nombres premiers. J. de mathématiques pures et appliquées 1, 366–390 (1852)
  38. Welch, T.A.: A technique for high-performance data compression. IEEE Computer 17(6), 8–19 (1984)
  39. Yoshinaga, N., Kitsuregawa, M.: A self-adaptive classifier for efficient text-stream processing. In: Proceedings of the COLING, pp. 1091–1102. ACL (2014)
  40. Ziv, J., Lempel, A.: A universal algorithm for sequential data compression. IEEE Trans. Inform. Theory 23(3), 337–343 (1977)
  41. Ziv, J., Lempel, A.: Compression of individual sequences via variable length coding. IEEE Trans. Inform. Theory 24(5), 530–536 (1978)
  42. Zobrist, A.L.: A new hashing method with application for game playing. Technical report 88, Computer Sciences Department, University of Wisconsin (1970)

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Department of Computer Science, TU Dortmund, Dortmund, Germany
