Abstract
The f-factorization of a string is similar to the well-known Lempel-Ziv (LZ) factorization, but differs from it in that a factor must not overlap its previous occurrence. Two linear-time algorithms for computing the f-factorization are known. Both compute the array of longest previous non-overlapping factors (\(\mathsf {LPnF}\)-array), from which the f-factorization can easily be derived. In this paper, we present a simple algorithm that computes the \(\mathsf {LPnF}\)-array from the \(\mathsf {LPF}\)-array and an array \(\mathsf {prevOcc}\) that stores positions of previous occurrences of LZ-factors. The algorithm has linear worst-case time complexity if \(\mathsf {prevOcc}\) contains leftmost positions. Moreover, we provide an algorithm that computes the f-factorization directly. Experiments show that our first method (combined with efficient \(\mathsf {LPF}\)-algorithms) is the fastest way to compute the f-factorization and that our second method is the most space-efficient.
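To make the definitions concrete, the following is a minimal brute-force sketch in Python. It is not either of the paper's algorithms and runs in far more than linear time. It assumes 0-based positions and takes \(\mathsf {LPnF}[i]\) to be the length of the longest prefix of the suffix starting at i that also occurs at some earlier position j and ends before i. The function names `lpnf_naive` and `f_factorization` are hypothetical, chosen only for illustration.

```python
def lpnf_naive(t):
    """Brute-force LPnF array (illustrative only, not linear time).

    lpnf[i] is the length of the longest prefix of t[i:] that also
    occurs starting at some j < i and ends before position i.
    """
    n = len(t)
    lpnf = [0] * n
    for i in range(n):
        best = 0
        for j in range(i):
            length = 0
            # Extend the match while it stays inside t and the earlier
            # occurrence t[j:j+length] does not reach position i.
            while i + length < n and j + length < i and t[j + length] == t[i + length]:
                length += 1
            best = max(best, length)
        lpnf[i] = best
    return lpnf


def f_factorization(t):
    """Derive the f-factorization from the LPnF array.

    A factor is a single fresh letter when lpnf[i] == 0, and otherwise
    the longest previous non-overlapping factor starting at i.
    """
    lpnf = lpnf_naive(t)
    factors = []
    i = 0
    while i < len(t):
        length = max(1, lpnf[i])
        factors.append(t[i:i + length])
        i += length
    return factors
```

For the string `aaaa`, this sketch yields the factors `a`, `a`, `aa`. The LZ factorization, which allows a factor to overlap its previous occurrence, would instead produce `a`, `aaa`.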
Notes
- 1. In the implementation, T is terminated by a special (EOF) symbol.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Ohlebusch, E., Weber, P. (2019). On the Computation of Longest Previous Non-overlapping Factors. In: Brisaboa, N., Puglisi, S. (eds) String Processing and Information Retrieval. SPIRE 2019. Lecture Notes in Computer Science, vol 11811. Springer, Cham. https://doi.org/10.1007/978-3-030-32686-9_26
DOI: https://doi.org/10.1007/978-3-030-32686-9_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-32685-2
Online ISBN: 978-3-030-32686-9
eBook Packages: Computer Science, Computer Science (R0)