Abstract
We present a new text compression scheme based on forbidden words (an “antidictionary”). We prove that our algorithms attain the entropy for balanced binary sources. They run in linear time. Moreover, one of the main advantages of this approach is that it produces very fast decompressors. A second advantage is a synchronization property that is helpful for searching compressed data and allows parallel compression. Our algorithms can also be presented as “compilers” that create compressors dedicated to any previously fixed source. The techniques used in this paper are from Information Theory and Finite Automata.
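The idea of compressing with forbidden words can be illustrated with a minimal sketch, assuming a binary alphabet and an antidictionary (a set of words that never occur in the text) known to both compressor and decompressor. Whenever a suffix of the text read so far, extended by one bit, would form a forbidden word, the next bit is forced to be the other one and need not be stored. The function names (`forbidden_words`, `predicted_bit`, `compress`, `decompress`) are hypothetical, and this naive version rescans suffixes at every position; it is not the linear-time, automaton-based method of the paper.

```python
from itertools import product


def forbidden_words(text, max_len):
    """All binary words of length <= max_len that do not occur in text.

    Illustrative only: a valid, but not minimal, antidictionary.
    """
    return {
        "".join(w)
        for k in range(1, max_len + 1)
        for w in product("01", repeat=k)
        if "".join(w) not in text
    }


def predicted_bit(prefix, antidictionary, max_len):
    """Return the bit forced after `prefix`, or None if none is forced."""
    for k in range(min(len(prefix), max_len - 1), -1, -1):
        suffix = prefix[len(prefix) - k:]
        for c in "01":
            if suffix + c in antidictionary:
                # suffix + c is forbidden, so the next bit must be the other one
                return "1" if c == "0" else "0"
    return None


def compress(text, antidictionary):
    """Drop every bit that the antidictionary makes predictable."""
    max_len = max(map(len, antidictionary), default=1)
    out = []
    for i, bit in enumerate(text):
        forced = predicted_bit(text[:i], antidictionary, max_len)
        if forced is None:
            out.append(bit)  # not predictable: keep the bit
        elif forced != bit:
            raise ValueError("antidictionary contains a word occurring in text")
    return "".join(out)


def decompress(compressed, length, antidictionary):
    """Rebuild a text of the given length from the kept bits."""
    max_len = max(map(len, antidictionary), default=1)
    kept = iter(compressed)
    out = ""
    while len(out) < length:
        forced = predicted_bit(out, antidictionary, max_len)
        out += forced if forced is not None else next(kept)
    return out
```

For example, with the text `0101010101` and the antidictionary `{"00", "11"}`, only the first bit is kept: every later bit is forced by one of the two forbidden words. The decompressor needs the original length in addition to the kept bits, since it cannot otherwise tell where the text ends.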
DCA home page at URL http://www-igm.univ-mlv.fr/~mac/DCA.html
Work by this author is supported in part by Programme “Génomes” of C.N.R.S.
© 1999 Springer-Verlag Berlin Heidelberg
Cite this paper
Crochemore, M., Mignosi, F., Restivo, A., Salemi, S. (1999). Text Compression Using Antidictionaries. In: Wiedermann, J., van Emde Boas, P., Nielsen, M. (eds) Automata, Languages and Programming. Lecture Notes in Computer Science, vol 1644. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48523-6_23
DOI: https://doi.org/10.1007/3-540-48523-6_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66224-2
Online ISBN: 978-3-540-48523-0
eBook Packages: Springer Book Archive