Dynamic Entropy-Compressed Sequences and Full-Text Indexes

  • Veli Mäkinen
  • Gonzalo Navarro
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4009)


Given a sequence of \(n\) bits with binary zero-order entropy \(H_0\), we present a dynamic data structure that requires \(nH_0 + o(n)\) bits of space and supports rank and select, as well as insertion and deletion of bits at arbitrary positions, in \(O(\log n)\) worst-case time. This extends previous results by Hon et al. [ISAAC 2003], which achieve \(O(\log n/\log\log n)\) time for rank and select but \(\Theta(\mathrm{polylog}(n))\) amortized time for inserting and deleting bits and require \(n + o(n)\) bits of space, and by Raman et al. [SODA 2002], which achieve constant query time but with a static structure. In particular, our result is the first entropy-bound dynamic data structure for rank and select over bit sequences.
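To make the four operations concrete, the following is a minimal reference sketch of their semantics only. It is not the structure described above: it is backed by a plain Python list, takes linear time per operation, and uses no compression. The class name `NaiveDynamicBitVector` and the indexing conventions (0-based positions, rank counting positions strictly before `i`, select counting occurrences from 1) are assumptions made for illustration.

```python
class NaiveDynamicBitVector:
    """Reference semantics only: list-backed, O(n) time per operation.

    Hypothetical helper for illustration; the paper's structure achieves
    O(log n) time and entropy-compressed space, which this does not.
    """

    def __init__(self, bits=()):
        self.bits = [int(b) for b in bits]

    def rank(self, bit, i):
        """Number of occurrences of `bit` among positions 0..i-1."""
        return sum(1 for b in self.bits[:i] if b == bit)

    def select(self, bit, j):
        """Position of the j-th occurrence of `bit` (j >= 1), or -1 if absent."""
        seen = 0
        for pos, b in enumerate(self.bits):
            if b == bit:
                seen += 1
                if seen == j:
                    return pos
        return -1

    def insert(self, i, bit):
        """Insert `bit` so that it ends up at position i."""
        self.bits.insert(i, int(bit))

    def delete(self, i):
        """Remove the bit at position i."""
        del self.bits[i]
```

With these conventions, `select(bit, j)` returns the smallest position `p` for which `rank(bit, p + 1) == j`, so the two queries are inverses of each other.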

We then show how the above result can be used to build a dynamic full-text self-index for a collection of texts over an alphabet of size \(\sigma\), of overall length \(n\) and zero-order entropy \(H_0\). The index requires \(nH_0 + o(n \log\sigma)\) bits of space and can count the occurrences of a pattern of length \(m\) in \(O(m \log n \log\sigma)\) time. Reporting the \(occ\) occurrences takes \(O(occ \log^2 n \log\sigma)\) time, at the cost of \(O(n)\) extra space. Inserting text into the collection takes \(O(\log n \log\sigma)\) time per symbol, and deleting text takes \(O(\log^2 n \log\sigma)\) time per symbol. This improves a previous result by Chan et al. [CPM 2004]. As a consequence, we obtain an \(O(n \log n \log\sigma)\) time construction algorithm for a compressed self-index that requires \(nH_0 + o(n \log\sigma)\) bits of working space during construction.
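The abstract does not spell out the counting procedure. As an illustration of how a count in \(m\) steps can reduce to rank queries, here is a sketch of the standard backward search over a Burrows-Wheeler transform, assumed here rather than taken from the paper. It handles a single static text, builds the transform naively by sorting rotations, and answers rank by scanning, so it shows the logic only and none of the stated time or space bounds. The function names and the `"\0"` sentinel are assumptions, and the text is assumed not to contain `"\0"`.

```python
def bwt(text):
    """Burrows-Wheeler transform via sorted rotations (naive, illustration only)."""
    s = text + "\0"  # assumed sentinel, smaller than every other symbol
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(r[-1] for r in rotations)


def count_occurrences(text, pattern):
    """Count occurrences of `pattern` in `text` by backward search."""
    last = bwt(text)

    # C[c] = number of symbols in the transform strictly smaller than c.
    C = {}
    total = 0
    for c in sorted(set(last)):
        C[c] = total
        total += last.count(c)

    def rank(c, i):
        # Occurrences of c in last[0:i]; a linear scan here, whereas an
        # index would answer this with a rank-capable sequence structure.
        return last[:i].count(c)

    sp, ep = 0, len(last)  # half-open range of sorted suffixes
    for c in reversed(pattern):  # one step per pattern symbol
        if c not in C:
            return 0
        sp = C[c] + rank(c, sp)
        ep = C[c] + rank(c, ep)
        if sp >= ep:
            return 0
    return ep - sp
```

Each of the \(m\) loop iterations issues two rank queries on the transformed sequence, which is where a rank structure over a general alphabet would enter the running time.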


Keywords (added by machine, not by the authors): Extra Space; Wavelet Tree; Rank Query; Dynamic Data Structure; Select Query




  1. Apostolico, A.: The myriad virtues of subword trees. In: Combinatorial Algorithms on Words. NATO ISI Series, pp. 85–96. Springer, Heidelberg (1985)
  2. Arroyuelo, D., Navarro, G.: Space-efficient construction of LZ-index. In: Deng, X., Du, D.-Z. (eds.) ISAAC 2005. LNCS, vol. 3827, pp. 1143–1152. Springer, Heidelberg (2005)
  3. Burrows, M., Wheeler, D.: A block sorting lossless data compression algorithm. Technical Report 124, Digital Equipment Corporation (1994)
  4. Chan, W.-L., Hon, W.-K., Lam, T.-W.: Compressed index for a dynamic collection of texts. In: Sahinalp, S.C., Muthukrishnan, S.M., Dogrusoz, U. (eds.) CPM 2004. LNCS, vol. 3109, pp. 445–456. Springer, Heidelberg (2004)
  5. Dietz, P.: Optimal algorithms for list indexing and subset rank. In: Dehne, F., Santoro, N., Sack, J.-R. (eds.) WADS 1989. LNCS, vol. 382, pp. 39–46. Springer, Heidelberg (1989)
  6. Ferragina, P., Manzini, G.: Opportunistic data structures with applications. In: Proc. FOCS 2000, pp. 390–398 (2000)
  7. Ferragina, P., Manzini, G., Mäkinen, V., Navarro, G.: Compressed representation of sequences and full-text indexes. ACM Transactions on Algorithms (to appear, 2006). Preliminary versions in: Proc. SPIRE 2004 and Tech. Rep. TR/DCC-2004-5, Dept. of Computer Science, Univ. of Chile, ftp://ftp.dcc.uchile.cl/pub/users/gnavarro/sequences.ps.gz
  8. Grossi, R., Gupta, A., Vitter, J.: High-order entropy-compressed text indexes. In: Proc. SODA 2003, pp. 841–850 (2003)
  9. Hon, W.-K., Sadakane, K., Sung, W.-K.: Succinct data structures for searchable partial sums. In: Ibaraki, T., Katoh, N., Ono, H. (eds.) ISAAC 2003. LNCS, vol. 2906, pp. 505–516. Springer, Heidelberg (2003)
  10. Mäkinen, V., Navarro, G.: Succinct suffix arrays based on run-length encoding. Nordic Journal of Computing 12(1), 40–66 (2005)
  11. Manber, U., Myers, G.: Suffix arrays: a new method for on-line string searches. SIAM Journal on Computing, 935–948 (1993)
  12. Navarro, G.: Indexing text using the Ziv-Lempel trie. Journal of Discrete Algorithms (JDA) 2(1), 87–114 (2004)
  13. Raman, R., Raman, V., Srinivasa Rao, S.: Succinct dynamic data structures. In: Dehne, F., Sack, J.-R., Tamassia, R. (eds.) WADS 2001. LNCS, vol. 2125, pp. 426–437. Springer, Heidelberg (2001)
  14. Raman, R., Raman, V., Srinivasa Rao, S.: Succinct indexable dictionaries with applications to encoding k-ary trees and multisets. In: Proc. SODA 2002, pp. 233–242 (2002)

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Veli Mäkinen (1)
  • Gonzalo Navarro (2)
  1. Department of Computer Science, University of Helsinki, Finland
  2. Department of Computer Science, University of Chile
