Suffix Trays and Suffix Trists: Structures for Faster Text Indexing
Suffix trees and suffix arrays are two of the most widely used data structures for text indexing. Each uses linear space and can be constructed in linear time [3,5,6,7]. However, when it comes to answering queries, the prior does so in O(mlog|Σ|) time, where m is the query size, |Σ| is the alphabet size, and the latter does so in O(m+logn), where n is the text size. We propose a novel way of combining the two into, what we call, a suffix tray. The space and construction time remain linear and the query time improves to O(m+log|Σ|).
We also consider the online version of indexing, where the indexing structure continues to update the text online and queries are answered in tandem. Here we suggest a suffix trist, a cross between a suffix tree and a suffix list. It supports queries in O(m+log|Σ|). The space and text update time of a suffix trist are the same as for the suffix tree or the suffix list.
KeywordsInternal Node Indexing Structure Query Time Suffix Tree Binary Search Tree
Unable to display preview. Download preview PDF.
- 1.Amir, A., Kopelowitz, T., Lewenstein, M., Lewenstein, N.: Towards Real-Time Suffix Tree Construction. In: Proc. of Symp. on String Processing and Information Retrieval (SPIRE), pp. 67–78 (2005)Google Scholar
- 3.Farach, M.: Optimal suffix tree construction with large alphabets. In: Proc. 38th IEEE Symposium on Foundations of Computer Science, pp. 137–143 (1997)Google Scholar
- 11.Weiner, P.: Linear pattern matching algorithm. In: Proc. 14th IEEE Symposium on Switching and Automata Theory, pp. 1–11 (1973)Google Scholar