Problem Definition
Suppose that a message \( { M=\langle s_1, s_2, \dots, s_n\rangle } \) of length \( { n=|M| } \) symbols is to be represented, where each symbol s i is an integer in the range \( { 1 \leq s_i \leq U } \), for some upper limit U that may or may not be known, and may or may not be finite. Messages in this form are commonly the output of some kind of modeling step in a data compression system. The objective is to represent the message over a binary output alphabet \( { \{ \texttt{0},\texttt{1}\} } \) using as few as possible output bits. A special case of the problem arises when the elements of the message are strictly increasing, \( { s_i < s_{i+1} } \). In this case the message M can be thought of as identifying a subset of \( { \{1, 2, \dots, U\} } \). Examples include storing sets of IP addresses or product codes, and recording the destinations of hyperlinks in the graph representation of the world wide web....
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsRecommended Reading
Anh, V.N., Moffat, A.: Improved word-aligned binary compression for text indexing. IEEE Trans. Knowl. Data Eng. 18(6), 857–861 (2006)
Boldi, P., Vigna, S.: Codes for the world-wide web. Internet Math. 2(4), 405–427 (2005)
Brisaboa, N.R., Fariña, A., Navarro, G., Esteller, M.F.: \( { ({S},{C}) } \)-dense coding: An optimized compression code for natural language text databases. In: Nascimento, M.A. (ed.) Proc. Symp. String Processing and Information Retrieval. LNCS, vol. 2857, pp. 122–136, Manaus, Brazil, October 2003
Chen, D., Chiang, Y.J., Memon, N., Wu, X.: Optimal alphabet partitioning for semi-adaptive coding of sources of unknown sparse distributions. In: Storer, J.A., Cohn, M. (eds.) Proc. 2003 IEEE Data Compression Conference, pp. 372–381, IEEE Computer Society Press, Los Alamitos, California, March 2003
Cheng, C.S., Shann, J.J.J., Chung, C.P.: Unique-order interpolative coding for fast querying and space-efficient indexing in information retrieval systems. Inf. Process. Manag. 42(2), 407–428 (2006)
Culpepper, J.S., Moffat, A.: Enhanced byte codes with restricted prefix properties. In: Consens, M.P., Navarro, G. (eds.) Proc. Symp. String Processing and Information Retrieval. LNCS Volume 3772, pp. 1–12, Buenos Aires, November 2005
de Moura, E.S., Navarro, G., Ziviani, N., Baeza-Yates, R.: Fast and flexible word searching on compressed text. ACM Trans. Inf. Syst. 18(2), 113–139 (2000)
Fenwick, P.: Universal codes. In: Sayood, K. (ed.) Lossless Compression Handbook, pp. 55–78, Academic Press, Boston (2003)
Fraenkel, A.S., Klein, S.T.: Novel compression of sparse bit-strings –Preliminary report. In: Apostolico, A., Galil, Z. (eds) Combinatorial Algorithms on Words, NATO ASI Series F, vol. 12, pp. 169–183. Springer, Berlin (1985)
Gupta, A., Hon, W.K., Shah, R., Vitter, J.S.: Compressed data structures: Dictionaries and data-aware measures. In: Storer, J.A., Cohn, M. (eds) Proc. 16th IEEE Data Compression Conference, pp. 213–222, IEEE, Snowbird, Utah, March 2006 Computer Society, Los Alamitos, CA
Moffat, A., Anh, V.N.: Binary codes for locally homogeneous sequences. Inf. Process. Lett. 99(5), 75–80 (2006) Source code available from www.cs.mu.oz.au/~alistair/rbuc/
Moffat, A., Stuiver, L.: Binary interpolative coding for effective index compression. Inf. Retr. 3(1), 25–47 (2000)
Moffat, A., Turpin, A.: Compression and Coding Algorithms. Kluwer Academic Publishers, Boston (2002)
Raman, R., Raman, V., Srinivasa Rao, S.: Succinct indexable dictionaries with applications to encoding k-ary trees and multisets. In: Proc. 13th ACM-SIAM Symposium on Discrete Algorithms, pp. 233–242, San Francisco, CA, January 2002, SIAM, Philadelphia, PA
Witten, I.H., Moffat, A., Bell, T.C.: Managing Gigabytes: Compressing and Indexing Documents and Images, 2nd edn. Morgan Kaufmann, San Francisco, (1999)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2008 Springer-Verlag
About this entry
Cite this entry
Moffat, A. (2008). Compressing Integer Sequences and Sets. In: Kao, MY. (eds) Encyclopedia of Algorithms. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-30162-4_84
Download citation
DOI: https://doi.org/10.1007/978-0-387-30162-4_84
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-30770-1
Online ISBN: 978-0-387-30162-4
eBook Packages: Computer ScienceReference Module Computer Science and Engineering