Abstract
The suffix array, a space efficient alternative to the suffix tree, is an important data structure for string processing, enabling efficient and often optimal algorithms for pattern matching, data compression, repeat finding and many problems arising in computational biology. An essential augmentation to the suffix array for many of these tasks is the Longest Common Prefix (LCP) array. In particular the LCP array allows one to simulate bottom-up and top-down traversals of the suffix tree with significantly less memory overhead (but in the same time bounds). Since 2001 the LCP array has been computable in Θ(n) time, but the algorithm (even after subsequent refinements) requires relatively large working memory. In this paper we describe a new algorithm that provides a continuous space-time tradeoff for LCP array construction, running in O(nv) time and requiring \(n + O(n/\sqrt{v} + v)\) bytes of working space, where v can be chosen to suit the available memory. Furthermore, the algorithm processes the suffix array, and outputs the LCP, strictly left-to-right, making it suitable for use with external memory. We show experimentally that for many naturally occurring strings our algorithm is faster than the linear time algorithms, while using significantly less working memory.
This work is supported by the Australian Research Council.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Abouelhoda, M.I., Kurtz, S., Ohlebusch, E.: Replacing suffix trees with enhanced suffix arrays. Journal of Discrete Algorithms 2(1), 53–86 (2004)
Apostolico, A.: The myriad virtues of subword trees. In: Apostolico, A., Galil, Z. (eds.) Combinatorial Algorithms on Words. NATO ASI Series F12, pp. 85–96. Springer, Heidelberg (1985)
Bender, M.A., Farach-Colton, M.: The LCA problem revisited. In: Gonnet, G., Panario, D., Viola, A. (eds.) LATIN 2000. LNCS, vol. 1776, pp. 88–94. Springer, Heidelberg (2000)
Burkhardt, S., Kärkkäinen, J.: Fast lightweight suffix array construction and checking. In: Baeza-Yates, R., Chávez, E., Crochemore, M. (eds.) CPM 2003. LNCS, vol. 2676, pp. 55–69. Springer, Heidelberg (2003)
Colbourn, C.J., Ling, A.C.H.: Quorums from difference covers. Information Processing Letters 75(1–2), 9–12 (2000)
Fischer, J., Heun, V.: Theoretical and practical improvements on the RMQ-problem, with applications to LCA and LCE. In: Lewenstein, M., Valiente, G. (eds.) CPM 2006. LNCS, vol. 4009, pp. 36–48. Springer, Heidelberg (2006)
Fischer, J., Heun, V.: A new succinct representation of RMQ-information and improvements in the enhanced suffix array. In: Chen, B., Paterson, M., Zhang, G. (eds.) ESCAPE 2007. LNCS, vol. 4614, pp. 459–470. Springer, Heidelberg (2007)
Gusfield, D.: Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology. Cambridge University Press, Cambridge (1997)
Kasai, T., Lee, G., Arimura, H., Arikawa, S., Park, K.: Linear-time longest-common-prefix computation in suffix arrays and its applications. In: Amir, A., Landau, G.M. (eds.) CPM 2001. LNCS, vol. 2089, pp. 181–192. Springer, Heidelberg (2001)
Manber, U., Myers, G.W.: Suffix arrays: a new method for on-line string searches. SIAM Journal of Computing 22(5), 935–948 (1993)
Manzini, G.: An analysis of the Burrows-Wheeler transform. Journal of the ACM 48(3), 407–430 (2001)
Manzini, G.: Two space saving tricks for linear time LCP computation. In: Hagerup, T., Katajainen, J. (eds.) SWAT 2004. LNCS, vol. 3111, pp. 372–383. Springer, Heidelberg (2004)
McCreight, E.M.: A space-economical suffix tree construction algroithm. Journal of the ACM 23(2), 262–272 (1976)
Puglisi, S.J., Smyth, W.F., Turpin, A.: A taxonomy of suffix array construction algorithms. ACM Computing Surveys 39(2), 1–31 (2007)
Smyth, B.: Computing Patterns in Strings. Pearson Addison-Wesley, Essex (2003)
Ukkonen, E.: Online construction of suffix trees. Algorithmica 14(3), 249–260 (1995)
Weiner, P.: Linear pattern matching algorithms. In: Proceedings of the 14th annual Symposium on Foundations of Computer Science, pp. 1–11 (1973)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Puglisi, S.J., Turpin, A. (2008). Space-Time Tradeoffs for Longest-Common-Prefix Array Computation. In: Hong, SH., Nagamochi, H., Fukunaga, T. (eds) Algorithms and Computation. ISAAC 2008. Lecture Notes in Computer Science, vol 5369. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-92182-0_14
Download citation
DOI: https://doi.org/10.1007/978-3-540-92182-0_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-92181-3
Online ISBN: 978-3-540-92182-0
eBook Packages: Computer ScienceComputer Science (R0)