Abstract
We address the problem of finding the locations of all instances of a string P in a text T, where preprocessing of T is allowed in order to facilitate the queries. Previous data structures for this problem include the suffix tree, the suffix array, and the compact DAWG. We modify a data structure called a sequence tree, which was proposed by Coffman and Eve for hashing, and adapt it to the new problem. We can then produce a list of k occurrences of any string P in T in O(||P|| + k) time. Because suffixes of a text share properties that arbitrary hash keys do not, we can build the structure in O(||T||) time, which is much faster than Coffman and Eve’s algorithm. These bounds are as good as those for the suffix tree, the suffix array, and the compact DAWG. The advantages are the elementary nature of some of the algorithms for constructing and using the data structure, and the asymptotic bounds we can give for updating the data structure when the text is edited.
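To make the indexing problem concrete, the following minimal sketch preprocesses T once and then answers occurrence queries for P. It uses a naively built suffix array, one of the previous structures named above, and not the contracted suffix tree of the paper. The function names `build_suffix_array` and `find_occurrences` are hypothetical, and this simple version does not reach the bounds stated in the abstract: construction sorts whole suffixes, and each query takes O(||P|| log ||T|| + k) time plus a final sort of the k results.

```python
def build_suffix_array(text):
    """Preprocess T: return suffix start positions in lexicographic order.

    Naive construction for illustration only (it compares whole suffixes),
    so it is far from the linear-time bound discussed in the abstract.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, pattern):
    """Return all start positions of `pattern` in `text`, in increasing order.

    Suffixes that begin with `pattern` form one contiguous run in the
    suffix array, so two binary searches locate the run's boundaries.
    """
    m = len(pattern)

    # Lower bound: first suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(sa[start:lo])


if __name__ == "__main__":
    T = "banana"
    index = build_suffix_array(T)          # built once, reused for every query
    print(find_occurrences(T, index, "ana"))
    print(find_occurrences(T, index, "na"))
```

The index is built once per text and reused across queries, which is the setting the abstract describes; what changes between structures is how fast the index can be built, queried, and updated after edits to T.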
References
Weiner, P.: Linear pattern-matching algorithms. In: Proceedings of the 14th IEEE Annual Symposium on Switching and Automata Theory, pp. 1–11. Institute of Electrical and Electronics Engineers, London (1973)
Blumer, A., Blumer, J., Ehrenfeucht, A., Haussler, D., McConnell, R.: Complete inverted files for efficient text retrieval and analysis. Journal of the ACM 34, 578–595 (1987)
Manber, U., Myers, E.: Suffix arrays: a new method for on-line search. SIAM J. Comput. 22, 935–948 (1993)
Ferragina, P., Grossi, R., Montangero, M.: On updating suffix tree labels. Theor. Comput. Sci. 201(1-2), 249–262 (1998)
Salson, M., Lecroq, T., Léonard, M., Mouchard, L.: Dynamic Burrows-Wheeler transform. Theoretical Computer Science (accepted, 2009)
Ziv, J., Lempel, A.: Compression of individual sequences via variable-rate coding. IEEE Transactions on Information Theory 24, 530–536 (1978)
Coffman, E., Eve, J.: File structures using hashing functions. Communications of the ACM 13, 427–432 (1970)
Cormen, T., Leiserson, C., Rivest, R., Stein, C.: Introduction to Algorithms. McGraw-Hill, Boston (2001)
Ehrenfeucht, A., McConnell, R.M.: String searching. In: Mehta, D., Sahni, S. (eds.) Handbook of Data Structures and Applications. CRC Press, Boca Raton (2005)
Tarjan, R.E.: Data structures and network algorithms. Society for Industrial and Applied Mathematics, Philadelphia (1983)
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ehrenfeucht, A., McConnell, R.M., Woo, SW. (2009). Contracted Suffix Trees: A Simple and Dynamic Text Indexing Data Structure. In: Kucherov, G., Ukkonen, E. (eds) Combinatorial Pattern Matching. CPM 2009. Lecture Notes in Computer Science, vol 5577. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02441-2_5
DOI: https://doi.org/10.1007/978-3-642-02441-2_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02440-5
Online ISBN: 978-3-642-02441-2
eBook Packages: Computer Science, Computer Science (R0)