Semi-dynamic Compact Index for Short Patterns and Succinct van Emde Boas Tree

  • Yoshiaki Matsuoka
  • Tomohiro I
  • Shunsuke Inenaga
  • Hideo Bannai
  • Masayuki Takeda
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9133)

Abstract

We present a compact semi-dynamic text index which allows us to find short patterns efficiently. For parameters \(k\le q \le \log _\sigma n - \log _\sigma \log _\sigma n\) and alphabet size \(\sigma = O(\mathrm {polylog}(n))\), all \( occ \) occurrences of a pattern of length at most \(q-k+1\) can be obtained in \(O(k \times occ + \log _\sigma n)\) time, where \(n\) is the length of the text. Adding characters to the end of the text is supported in amortized constant time. Our index requires \((n/k) \log (n/k) + n \log \sigma + o(n)\) bits of space, which is compact (i.e., \(O(n \log \sigma )\)) when \(k = \varTheta (\log _{\sigma } n)\). As a byproduct, we present a succinct van Emde Boas tree which supports insertion, deletion, predecessor, and successor on a dynamic set of integers over the universe \([0, m-1]\) in \(O(\log \log m)\) time and requires only \(m + o(m)\) bits of space.

References

  1. 1.
    Claude, F., Farina, A., Martínez-Prieto, M.A., Navarro, G.: Compressed \(q\)-gram indexing for highly repetitive biological sequences. In: Proceedings of the BIBE 2010, pp. 86–91 (2010)Google Scholar
  2. 2.
    van Emde Boas, P.: Preserving order in a forest in less than logarithmic time. In: FOCS, pp. 75–84. IEEE Computer Society (1975)Google Scholar
  3. 3.
    Ferragina, P., Grossi, R.: Optimal on-line search and sublinear time update in string matching. SIAM J. Comput. 27(3), 713–736 (1998)MATHMathSciNetCrossRefGoogle Scholar
  4. 4.
    Ferragina, P., Manzini, G.: Opportunistic data structures with applications. In: FOCS, pp. 390–398. IEEE Computer Society (2000)Google Scholar
  5. 5.
    Gupta, A., Hon, W.K., Shah, R., Vitter, J.S.: Dynamic rank/select dictionaries with applications to XML indexing. Technical report 06–014, Purdue University (2006)Google Scholar
  6. 6.
    Hon, W.K., Lam, T.W., Sadakane, K., Sung, W.K., Yiu, S.M.: Compressed index for dynamic text. In: Data Compression Conference, pp. 102–111 (2004)Google Scholar
  7. 7.
    Leiserson, C.E., Prokop, H., Randall, K.H.: Using de Bruijn sequences to index a 1 in a computer word (1998) (unpublished manuscript)Google Scholar
  8. 8.
    Mäkinen, V., Navarro, G.: Succinct suffix arrays based on run-length encoding. Nord. J. Comput. 12(1), 40–66 (2005)Google Scholar
  9. 9.
    Manber, U., Myers, G.: Suffix arrays: a new method for on-line string searches. SIAM J. Comput. 22(5), 935–948 (1993)MATHMathSciNetCrossRefGoogle Scholar
  10. 10.
    Navarro, G.: Indexing text using the Ziv-Lempel trie. J. Discret. Algorithms 2(1), 87–114 (2004)MATHCrossRefGoogle Scholar
  11. 11.
    Navarro, G., Nekrich, Y.: Optimal dynamic sequence representations. In: Proceedings of the SODA 2013, pp. 865–876 (2013)Google Scholar
  12. 12.
    Rasmussen, K.R., Stoye, J., Myers, E.W.: Efficient \(q\)-gram filters for finding all \(\epsilon \)-matches over a given length. J. Comput. Biol. 13(2), 296–308 (2006)MathSciNetCrossRefGoogle Scholar
  13. 13.
    Salson, M., Lecroq, T., Léonard, M., Mouchard, L.: Dynamic extended suffix arrays. J. Discret. Algorithms 8(2), 241–257 (2010)MATHCrossRefGoogle Scholar
  14. 14.
    Ukkonen, E.: On-line construction of suffix trees. Algorithmica 14(3), 249–260 (1995)MATHMathSciNetCrossRefGoogle Scholar
  15. 15.
    Weiner, P.: Linear pattern-matching algorithms. In: Proceedings of 14th IEEE Annual Symposium on Switching and Automata Theory, pp. 1–11 (1973)Google Scholar
  16. 16.
    Willard, D.E.: Log-logarithmic worst-case range queries are possible in space \(\Theta (N)\). Inf. Process. Lett. 17(2), 81–84 (1983)MATHMathSciNetCrossRefGoogle Scholar

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Yoshiaki Matsuoka
    • 1
  • Tomohiro I
    • 2
  • Shunsuke Inenaga
    • 1
  • Hideo Bannai
    • 1
  • Masayuki Takeda
    • 1
  1. 1.Department of InformaticsKyushu UniversityFukuokaJapan
  2. 2.Department of Computer ScienceTU DortmundDortmundGermany

Personalised recommendations