Skip to main content

Linear Time Lempel-Ziv Factorization: Simple, Fast, Small

  • Conference paper
Combinatorial Pattern Matching (CPM 2013)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 7922))

Included in the following conference series:

Abstract

Computing the LZ factorization (or LZ77 parsing) of a string is a computational bottleneck in many diverse applications, including data compression, text indexing, and pattern discovery. We describe new linear time LZ factorization algorithms, some of which require only 2nlogn + O(logn) bits of working space to factorize a string of length n. These are the most space efficient linear time algorithms to date, using n logn bits less space than any previous linear time algorithm. The algorithms are also simple to implement, very fast in practice, and amenable to streaming implementation.

This research is partially supported by Academy of Finland grants 118653 (ALGODAN) and 250345 (CoECGR).

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 49.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Al-Hafeedh, A., Crochemore, M., Ilie, L., Kopylova, E., Smyth, W., Tischler, G., Yusufu, M.: A comparison of index-based Lempel-Ziv LZ77 factorization algorithms. ACM Comput. Surv. 45(1), 5:1–5:17 (2012)

    Article  Google Scholar 

  2. Charikar, M., Lehman, E., Liu, D., Panigrhy, R., Prabhakaran, M., Sahai, A., Shelat, A.: The smallest grammar problem. IEEE Transactions on Information Theory 51(7), 2554–2576 (2005)

    Article  Google Scholar 

  3. Chen, G., Puglisi, S.J., Smyth, W.F.: Fast and practical algorithms for computing all the runs in a string. In: Ma, B., Zhang, K. (eds.) CPM 2007. LNCS, vol. 4580, pp. 307–315. Springer, Heidelberg (2007)

    Chapter  Google Scholar 

  4. Crochemore, M., Ilie, L.: Computing longest previous factor in linear time and applications. Information Processing Letters 106(2), 75–80 (2008)

    Article  MathSciNet  MATH  Google Scholar 

  5. Crochemore, M., Ilie, L., Iliopoulos, C.S., Kubica, M., Rytter, W., Waleń, T.: LPF computation revisited. In: Fiala, J., Kratochvíl, J., Miller, M. (eds.) IWOCA 2009. LNCS, vol. 5874, pp. 158–169. Springer, Heidelberg (2009)

    Chapter  Google Scholar 

  6. Crochemore, M., Ilie, L., Smyth, W.F.: A simple algorithm for computing the Lempel-Ziv factorization. In: DCC 2008, pp. 482–488. IEEE Computer Society (2008)

    Google Scholar 

  7. Gagie, T., Gawrychowski, P., Kärkkäinen, J., Nekrich, Y., Puglisi, S.J.: A faster grammar-based self-index. In: Dediu, A.-H., Martín-Vide, C. (eds.) LATA 2012. LNCS, vol. 7183, pp. 240–251. Springer, Heidelberg (2012)

    Chapter  Google Scholar 

  8. Gagie, T., Gawrychowski, P., Puglisi, S.J.: Faster approximate pattern matching in compressed repetitive texts. In: Asano, T., Nakano, S.-i., Okamoto, Y., Watanabe, O. (eds.) ISAAC 2011. LNCS, vol. 7074, pp. 653–662. Springer, Heidelberg (2011)

    Chapter  Google Scholar 

  9. Goto, K., Bannai, H.: Simpler and faster Lempel Ziv factorization. In: DCC 2013, pp. 133–142. IEEE Computer Society (2013)

    Google Scholar 

  10. Kärkkäinen, J., Kempa, D., Puglisi, S.J.: Lightweight Lempel-Ziv parsing. In: Bonifaci, V. (ed.) SEA 2013. LNCS, vol. 7933, pp. 139–150. Springer, Heidelberg (2013)

    Google Scholar 

  11. Kärkkäinen, J., Sanders, P., Burkhardt, S.: Linear work suffix array construction. Journal of the ACM 53(6), 918–936 (2006)

    Article  MathSciNet  Google Scholar 

  12. Kärkkäinen, J., Manzini, G., Puglisi, S.J.: Permuted longest-common-prefix array. In: Kucherov, G., Ukkonen, E. (eds.) CPM 2009 Lille. LNCS, vol. 5577, pp. 181–192. Springer, Heidelberg (2009)

    Chapter  Google Scholar 

  13. Kempa, D., Puglisi, S.J.: Lempel-Ziv factorization: simple, fast, practical. In: Zeh, N., Sanders, P. (eds.) ALENEX 2013, pp. 103–112. SIAM (2013)

    Google Scholar 

  14. Kreft, S., Navarro, G.: Self-indexing based on LZ77. In: Giancarlo, R., Manzini, G. (eds.) CPM 2011. LNCS, vol. 6661, pp. 41–54. Springer, Heidelberg (2011)

    Chapter  Google Scholar 

  15. Navarro, G., Mäkinen, V.: Compressed full-text indexes. ACM Computing Surveys 39(1), article 2 (2007)

    Google Scholar 

  16. Ohlebusch, E., Gog, S.: Lempel-Ziv factorization revisited. In: Giancarlo, R., Manzini, G. (eds.) CPM 2011. LNCS, vol. 6661, pp. 15–26. Springer, Heidelberg (2011)

    Chapter  Google Scholar 

  17. Wu, F.: Sequential file prefetching in Linux. In: Wiseman, Y., Jiang, S. (eds.) Advanced Operating Systems and Kernel Applications: Techniques and Technologies, ch. 11, pp. 217–236. IGI Global (2009)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kärkkäinen, J., Kempa, D., Puglisi, S.J. (2013). Linear Time Lempel-Ziv Factorization: Simple, Fast, Small. In: Fischer, J., Sanders, P. (eds) Combinatorial Pattern Matching. CPM 2013. Lecture Notes in Computer Science, vol 7922. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38905-4_19

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-38905-4_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-38904-7

  • Online ISBN: 978-3-642-38905-4

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics