Skip to main content

Approximating LZ77 via Small-Space Multiple-Pattern Matching

  • Conference paper
  • First Online:
Algorithms - ESA 2015

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 9294))

Abstract

We generalize Karp-Rabin string matching to handle multiple patterns in \(\mathcal{O}(n \log n + m)\) time and \(\mathcal{O}(s)\) space, where n is the length of the text and m is the total length of the s patterns, returning correct answers with high probability. As a prime application of our algorithm, we show how to approximate the LZ77 parse of a string of length n. If the optimal parse consists of z phrases, using only \(\mathcal{O}(z)\) working space we can return a parse consisting of at most 2z phrases in \(\mathcal{O}(n\log n)\) time, and a parse of at most (1 + ε)z phrases in \(\mathcal{O}(n\log^{2}n)\) for any constant ε > 0. As previous quasilinear-time algorithms for LZ77 use Ω(n/polylogn) space, but z can be exponentially small in n, these improvements in space consumption are substantial.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 84.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 109.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Rabin-Karp algorithm — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Rabin-Karp_algorithm&oldid=665980736

  2. Aho, A.V., Corasick, M.J.: Efficient string matching: An aid to bibliographic search. Commun. ACM 18(6), 333–340 (1975)

    Article  MathSciNet  MATH  Google Scholar 

  3. Breslauer, D., Grossi, R., Mignosi, F.: Simple real-time constant-space string matching. Theor. Comput. Sci. 483, 2–9 (2013)

    Article  MathSciNet  MATH  Google Scholar 

  4. Clifford, R., Fontaine, A., Porat, E., Sach, B., Starikovskaya, T.: Dictionary matching in a stream. In: Bansal, N., Finocchi, I. (eds.) ESA 2015. LNCS, vol. 8737, pp. ??–?? Springer, Heidelberg (2015)

    Google Scholar 

  5. Crochemore, M., Perrin, D.: Two-way string matching. J. ACM 38(3), 651–675 (1991)

    Article  MathSciNet  MATH  Google Scholar 

  6. Fine, N.J., Wilf, H.S.: Uniqueness theorems for periodic functions. P. Am. Math. Soc. 16(1), 109–114 (1965)

    Article  MathSciNet  MATH  Google Scholar 

  7. Fischer, J., Gagie, T., Gawrychowski, P., Kociumaka, T.: Approximating LZ77 via small-space multiple-pattern matching. CoRR abs/1504.06647 (2015)

    Google Scholar 

  8. Fischer, J., I, T., Köppl, D.: Lempel Ziv Computation in Small Space (LZ-CISS). In: Cicalese, F., Porat, E., Vaccaro, U. (eds.) CPM 2015. LNCS, vol. 9133, pp. 172–184. Springer, Heidelberg (2015)

    Chapter  Google Scholar 

  9. Galil, Z., Seiferas, J.I.: Time-space-optimal string matching. J. Comput. Syst. Sci. 26(3), 280–294 (1983)

    Article  MathSciNet  MATH  Google Scholar 

  10. Gasieniec, L., Karpinski, M., Plandowski, W., Rytter, W.: Efficient algorithms for Lempel-Ziv encoding (extended abstract). In: Karlsson, R., Lingas, A. (eds.) SWAT 1996. LNCS, vol. 1097, pp. 392–403. Springer, Heidelberg (1996)

    Chapter  Google Scholar 

  11. Gum, B., Lipton, R.J.: Cheaper by the dozen: Batched algorithms. In: Kumar, V., Grossman, R.L. (eds.) SDM 2001, pp. 1–11. SIAM, Philadelphia (2001)

    Google Scholar 

  12. Hon, W., Ku, T., Shah, R., Thankachan, S.V., Vitter, J.S.: Faster compressed dictionary matching. Theor. Comput. Sci. 475, 113–119 (2013)

    Article  MathSciNet  MATH  Google Scholar 

  13. Kärkkäinen, J., Kempa, D., Puglisi, S.J.: Lightweight Lempel-Ziv parsing. In: Bonifaci, V., Demetrescu, C., Marchetti-Spaccamela, A. (eds.) SEA 2013. LNCS, vol. 7933, pp. 139–150. Springer, Heidelberg (2013)

    Chapter  Google Scholar 

  14. Karp, R.M., Rabin, M.O.: Efficient randomized pattern-matching algorithms. IBM J. Res. Dev. 31(2), 249–260 (1987)

    Article  MathSciNet  MATH  Google Scholar 

  15. Kociumaka, T., Starikovskaya, T., Vildhøj, H.W.: Sublinear space algorithms for the longest common substring problem. In: Schulz, A.S., Wagner, D. (eds.) ESA 2014. LNCS, vol. 8737, pp. 605–617. Springer, Heidelberg (2014)

    Google Scholar 

  16. Lohrey, M.: Algorithmics on SLP-compressed strings: A survey. Groups Complexity Cryptology 4(2), 241–299 (2012)

    Article  MathSciNet  MATH  Google Scholar 

  17. Ružić, M.: Constructing efficient dictionaries in close to sorting time. In: Aceto, L., Damgård, I., Goldberg, L.A., Halldórsson, M.M., Ingólfsdóttir, A., Walukiewicz, I. (eds.) ICALP 2008, Part I. LNCS, vol. 5125, pp. 84–95. Springer, Heidelberg (2008)

    Chapter  Google Scholar 

  18. Ukkonen, E.: On-line construction of suffix trees. Algorithmica 14(3), 249–260 (1995)

    Article  MathSciNet  MATH  Google Scholar 

  19. Ziv, J., Lempel, A.: A universal algorithm for sequential data compression. IEEE Trans. Inform. Theory 23(3), 337–343 (1977)

    Article  MathSciNet  MATH  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Johannes Fischer .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2015 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Fischer, J., Gagie, T., Gawrychowski, P., Kociumaka, T. (2015). Approximating LZ77 via Small-Space Multiple-Pattern Matching. In: Bansal, N., Finocchi, I. (eds) Algorithms - ESA 2015. Lecture Notes in Computer Science(), vol 9294. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-48350-3_45

Download citation

  • DOI: https://doi.org/10.1007/978-3-662-48350-3_45

  • Published:

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-48349-7

  • Online ISBN: 978-3-662-48350-3

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics