Dictionary-Symbolwise Flexible Parsing

  • Maxime Crochemore
  • Laura Giambruno
  • Alessio Langiu
  • Filippo Mignosi
  • Antonio Restivo
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6460)

Abstract

Linear-time optimal parsing algorithms are very rare in the dictionary-based branch of data compression theory. The most recent is the Flexible Parsing algorithm of Matias and Sahinalp, which works when the dictionary is prefix-closed and the encoding of dictionary pointers has constant cost. We present the Dictionary-Symbolwise Flexible Parsing algorithm, which is optimal for prefix-closed dictionaries and any symbolwise compressor under some natural hypothesis. For LZ78-like algorithms with variable costs and any symbolwise compressor (linear-time, as is usual), it can be implemented in linear time. For LZ77-like dictionaries and any symbolwise compressor, it can be implemented in O(n log n) time. We further present some experimental results that show the effectiveness of the dictionary-symbolwise approach.

References

  1. Bell, T.C., Witten, I.H.: The relationship between greedy parsing and symbolwise text compression. J. ACM 41(4), 708–724 (1994)
  2. Cohn, M., Khazan, R.: Parsing with prefix and suffix dictionaries. In: Data Compression Conference, pp. 180–189 (1996)
  3. Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C.: Introduction to Algorithms, 2nd edn. MIT Press, Cambridge (2001)
  4. Ferragina, P., Nitto, I., Venturini, R.: On the bit-complexity of Lempel-Ziv compression. In: Proceedings of the Nineteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2009, pp. 768–777. Society for Industrial and Applied Mathematics, Philadelphia (2009)
  5. Gzip’s Home Page, http://www.gzip.org
  6. Hartman, A., Rodeh, M.: Optimal parsing of strings, pp. 155–167. Springer, Heidelberg (1985)
  7. Horspool, R.N.: The effect of non-greedy parsing in Ziv-Lempel compression methods. In: Data Compression Conference (1995)
  8. Katajainen, J., Raita, T.: An approximation algorithm for space-optimal encoding of a text. Comput. J. 32(3), 228–237 (1989)
  9. Katajainen, J., Raita, T.: An analysis of the longest match and the greedy heuristics in text encoding. J. ACM 39(2), 281–294 (1992)
  10. Katz, P.: Pkzip archiving tool (1989), http://en.wikipedia.org/wiki/pkzip
  11. Kim, T.Y., Kim, T.: On-line optimal parsing in dictionary-based coding adaptive. Electronics Letters 34(11), 1071–1072 (1998)
  12. Klein, S.T.: Efficient optimal recompression. Comput. J. 40(2/3), 117–126 (1997)
  13. Mahoney, M.: Large text compression benchmark, http://mattmahoney.net/text/text.html
  14. Martelock, C.: Rzm order-1 rolz compressor (April 2008), http://encode.ru/forums/index.php?action=vthread&forum=1&topic=647
  15. Matias, Y., Rajpoot, N., Sahinalp, S.C.: The effect of flexible parsing for dynamic dictionary-based data compression. ACM Journal of Experimental Algorithms 6, 10 (2001)
  16. Matias, Y., Sahinalp, S.C.: On the optimality of parsing in dynamic dictionary based data compression. In: SODA, pp. 943–944 (1999)
  17. Della Penna, G., Langiu, A., Mignosi, F., Ulisse, A.: Optimal parsing in dictionary-symbolwise data compression schemes (2006), http://www.di.univaq.it/mignosi/ulicompressor.php
  18. Schuegraf, E.J., Heaps, H.S.: A comparison of algorithms for data base compression by use of fragments as language elements. Information Storage and Retrieval 10(9-10), 309–319 (1974)
  19. Storer, J.A., Szymanski, T.G.: Data compression via textual substitution. J. ACM 29(4), 928–951 (1982)
  20. Wagner, R.A.: Common phrases and minimum-space text storage. Commun. ACM 16(3), 148–152 (1973)

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Maxime Crochemore (1, 4)
  • Laura Giambruno (2)
  • Alessio Langiu (2, 4)
  • Filippo Mignosi (3)
  • Antonio Restivo (2)
  1. Dept. of Computer Science, King’s College London, London, UK
  2. Dipartimento di Matematica e Informatica, Università di Palermo, Palermo, Italy
  3. Dipartimento di Informatica, Università dell’Aquila, L’Aquila, Italy
  4. Institut Gaspard-Monge, Université Paris-Est, France
