Skip to main content

Improved Address-Calculation Coding of Integer Arrays

  • Conference paper
String Processing and Information Retrieval (SPIRE 2012)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 7608))

Included in the following conference series:

Abstract

In this paper we deal with compressed integer arrays that are equipped with fast random access. Our treatment improves over an earlier approach that used address-calculation coding to locate the elements and supported access and search operations in \(O(\lg (n+s))\) time for a sequence of n non-negative integers summing up to s. The idea is to complement the address-calculation method with index structures that considerably decrease access times and also enable updates. For all our structures the memory usage is \(n \lg(1 + s/n) + O(n)\) bits. First a read-only version is introduced that supports rank-based accesses to elements and retrievals of prefix sums in \(O(\lg \lg (n+s)\)) time, as well as prefix-sum searches in \(O(\lg n+ \lg \lg s)\) time, using the word RAM as the model of computation. The second version of the data structure supports accesses in \(O(\lg\lg U)\) time and changes of element values in \(O(\lg^2 U)\) time, where U is the universe size. Both versions performed quite well in practical experiments. A third extension to dynamic arrays is also described, supporting accesses and prefix-sum searches in \(O(\lg n + \lg\lg U)\) time, and insertions and deletions in \(O(\lg^2 U)\) time.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Agrawal, R., Imieliński, T., Swami, A.: Mining association rules between sets of items in large databases. SIGMOD Rec. 22(2), 207–216 (1993)

    Article  Google Scholar 

  2. Brisaboa, N.R., Ladra, S., Navarro, G.: Directly Addressable Variable-Length Codes. In: Karlgren, J., Tarhio, J., Hyyrö, H. (eds.) SPIRE 2009. LNCS, vol. 5721, pp. 122–130. Springer, Heidelberg (2009)

    Chapter  Google Scholar 

  3. Brodnik, A.: Computation of the least significant set bit. In: Proc. Electrotechnical and Comput. Sci. Conf., vol. B, pp. 7–10 (1993)

    Google Scholar 

  4. Culpepper, J.S., Moffat, A.: Compact Set Representation for Information Retrieval. In: Ziviani, N., Baeza-Yates, R. (eds.) SPIRE 2007. LNCS, vol. 4726, pp. 137–148. Springer, Heidelberg (2007)

    Chapter  Google Scholar 

  5. Delpratt, O., Rahman, N., Raman, R.: Compressed Prefix Sums. In: van Leeuwen, J., Italiano, G.F., van der Hoek, W., Meinel, C., Sack, H., Plášil, F. (eds.) SOFSEM 2007. LNCS, vol. 4362, pp. 235–247. Springer, Heidelberg (2007)

    Chapter  Google Scholar 

  6. Fenwick, P.M.: A new data structure for cumulative frequency tables. Software Pract. Exper. 24(3), 327–336 (1994)

    Article  Google Scholar 

  7. Ferragina, P., Venturini, R.: A simple storage scheme for strings achieving entropy bounds. Theoret. Comput. Sci. 372(1), 115–121 (2007)

    Article  MathSciNet  MATH  Google Scholar 

  8. González, R., Navarro, G.: Statistical Encoding of Succinct Data Structures. In: Lewenstein, M., Valiente, G. (eds.) CPM 2006. LNCS, vol. 4009, pp. 294–305. Springer, Heidelberg (2006)

    Chapter  Google Scholar 

  9. Gupta, A., Hon, W.K., Shah, R., Vitter, J.S.: Compressed data structures: Dictionaries and data-aware measures. Theoret. Comput. Sci. 387(3), 313–331 (2007)

    Article  MathSciNet  MATH  Google Scholar 

  10. Hagerup, T.: Sorting and Searching on the Word RAM. In: Meinel, C., Morvan, M. (eds.) STACS 1998. LNCS, vol. 1373, pp. 366–398. Springer, Heidelberg (1998)

    Chapter  Google Scholar 

  11. Jansson, J., Sadakane, K., Sung, W.K.: CRAM: Compressed random access memory. E-print arXiv:1011.1708v2, arXiv.org, Ithaca (2012)

    Google Scholar 

  12. Katajainen, J., Rao, S.S.: A compact data structure for representing a dynamic multiset. Inform. Process. Lett. 110(23), 1061–1066 (2010)

    Article  MathSciNet  Google Scholar 

  13. Moffat, A.: Compressing integer sequences and sets. In: Encyclopedia of Algorithms, pp. 178–182. Springer Science+Business Media, LLC, New York (2008)

    Google Scholar 

  14. Moffat, A., Stuiver, L.: Binary interpolative coding for effective index compression. Inform. Retrieval 3(1), 25–47 (2000)

    Article  Google Scholar 

  15. Moffat, A., Zobel, J.: Self-indexing inverted files for fast text retrieval. ACM Trans. Inform. Syst. 14(4), 349–379 (1996)

    Article  Google Scholar 

  16. Raman, R., Raman, V., Rao, S.S.: Succinct Dynamic Data Structures. In: Dehne, F., Sack, J.-R., Tamassia, R. (eds.) WADS 2001. LNCS, vol. 2125, pp. 426–437. Springer, Heidelberg (2001)

    Chapter  Google Scholar 

  17. Raman, R., Raman, V., Satti, S.R.: Succinct indexable dictionaries with applications to encoding k-ary trees, prefix sums and multisets. ACM Trans. Algorithms 3(4), 43:1–43:25 (2007)

    Google Scholar 

  18. Sadakane, K., Grossi, R.: Squeezing succinct data structures into entropy bounds. In: Proc. 17th Annual ACM-SIAM Symp. on Discrete Algorithms, pp. 1230–1239. ACM/SIAM, New York/Philadelphia (2006)

    Chapter  Google Scholar 

  19. Teuhola, J.: Interpolative coding of integer sequences supporting log-time random access. Inform. Process. Manag. 47(5), 742–761 (2011)

    Article  Google Scholar 

  20. Transier, F., Sanders, P.: Engineering basic algorithms of an in-memory text search engine. ACM Trans. Inform. Syst. 29(1), 2:1–2:37 (2010)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Elmasry, A., Katajainen, J., Teuhola, J. (2012). Improved Address-Calculation Coding of Integer Arrays. In: Calderón-Benavides, L., González-Caro, C., Chávez, E., Ziviani, N. (eds) String Processing and Information Retrieval. SPIRE 2012. Lecture Notes in Computer Science, vol 7608. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34109-0_22

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-34109-0_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-34108-3

  • Online ISBN: 978-3-642-34109-0

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics