Abstract
If a sequence of random variables has Shannon entropy H, it is well known that there exists an efficient description of the sequence requiring only about H bits. But the entropy H of a sequence also governs inference: low-entropy sequences allow good guesses of their next terms. This is best illustrated by letting a gambler wager at fair odds on the successive terms of such a sequence. The logarithm of the amount of money that one can make is essentially the complement of the entropy with respect to the length of the sequence.
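As a hedged illustration (my sketch, not the chapter's): a gambler who bets proportionally on the next symbol of an i.i.d. binary sequence at fair 2-for-1 odds sees log2(wealth) grow by 1 − H(p) per symbol, so over n symbols the winnings exponent is roughly n − nH(p), the complement of the entropy.

```python
# A minimal sketch (not from the chapter): proportional ("Kelly") gambling
# at fair 2-for-1 odds on an i.i.d. Bernoulli(p) bit sequence. The growth
# of log2(wealth) per bit approaches 1 - H(p).
import math
import random

def entropy(p):
    """Binary entropy H(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def gamble(bits, p):
    """Bet fraction p of wealth on '1' and 1-p on '0' at fair 2-for-1 odds.

    Returns log2 of final wealth (starting wealth = 1). Each round the
    money placed on the symbol that occurs is doubled and the rest is
    lost, so wealth is multiplied by 2p or 2(1-p).
    """
    log_wealth = 0.0
    for b in bits:
        log_wealth += math.log2(2 * (p if b == 1 else 1 - p))
    return log_wealth

random.seed(0)
p, n = 0.9, 10_000
bits = [1 if random.random() < p else 0 for _ in range(n)]
print("log2(wealth)/n =", gamble(bits, p) / n)  # ~ 1 - H(p)
print("1 - H(p)       =", 1 - entropy(p))       # ~ 0.531
```

Proportional betting is the optimal strategy here; any other fixed betting fraction grows strictly slower in expectation.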
Now suppose that the sequence is not random. Although the entropy of an individual sequence is not defined, there is a notion of its intrinsic descriptive complexity. This idea, put forth by Kolmogorov, Chaitin, and Solomonoff, defines the intrinsic complexity of a sequence as the length of its shortest description. Here too there is a tradeoff between complexity and inference: low-complexity sequences allow a high degree of inference, and again the tradeoff can be expressed in terms of gambling.
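A hedged sketch of the compression side (zlib is my stand-in; true Kolmogorov complexity K(x) is uncomputable, and compressed length only upper-bounds it): a universal gambling scheme of the kind analyzed in the chapter converts short descriptions into winnings, with exponent roughly n − K(x).

```python
# A minimal sketch (zlib is my proxy, not the chapter's method):
# compressed length as a computable upper bound on K(x). A structured
# sequence has a short description and hence a large "wealth exponent"
# n - K(x); a random one does not.
import os
import zlib

n = 10_000
structured = b"01" * (n // 2)   # trivially describable: "repeat '01'"
random_ish = os.urandom(n)      # incompressible with high probability

for name, x in [("structured", structured), ("random", random_ish)]:
    k_proxy = 8 * len(zlib.compress(x, 9))  # proxy for K(x), in bits
    print(f"{name:10s}: n = {8 * len(x)} bits, proxy K ~ {k_proxy} bits, "
          f"wealth exponent ~ {8 * len(x) - k_proxy}")
```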
Finally, it will be shown that if a sequence is drawn at random from a source with entropy H, then with high probability its Kolmogorov complexity is close to H.
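A minimal sketch of why this is plausible (mine, using the Shannon codelength rather than K itself): by the law of large numbers, −(1/n) log2 p(X^n) concentrates at H(p), and for typical sequences K is within lower-order terms of this codelength.

```python
# A minimal sketch (not the chapter's proof): the per-symbol Shannon
# codelength -(1/n) log2 p(X^n) of a Bernoulli(p) sequence concentrates
# near H(p), illustrating K/n ~ H with high probability.
import math
import random

def entropy(p):
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

random.seed(2)
p, n = 0.1, 100_000
for trial in range(3):
    bits = [random.random() < p for _ in range(n)]
    ones = sum(bits)
    # -log2 p(x^n) = -(#ones) log2 p - (#zeros) log2 (1-p)
    codelength = -(ones * math.log2(p) + (n - ones) * math.log2(1 - p))
    print(f"trial {trial}: codelength/n = {codelength / n:.4f}")
print(f"H(p)            = {entropy(p):.4f}")
```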
Special attention will be given to the so-called Kolmogorov H function, a function that has not yet made its appearance in the literature. We argue that it plays the role of a minimal sufficient statistic; in this sense, there is a sufficient statistic for the Mona Lisa. This idea will capture the fundamental structure of geometrical patterns, probability distributions, and the laws of nature.
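In modern notation (a sketch following the later structure-function literature; the notation h_x is introduced here for illustration and the chapter's own definitions may differ), the idea can be written:

```latex
% Hypothetical notation h_x(k); K denotes Kolmogorov complexity.
% The "H function" at complexity budget k: the log-size of the best
% simple set (model) containing x.
h_x(k) \;=\; \min\bigl\{\, \log_2 |S| \;:\; x \in S,\ K(S) \le k \,\bigr\}.

% Two-part description: describe S, then the index of x within S, so
% K(x) \le k + h_x(k) + O(\log n) \quad \text{for all } k.
% The least k at which k + h_x(k) \approx K(x) yields a set S^* acting
% as a minimal sufficient statistic: S^* carries the structure of x,
% and the index of x within S^* is incompressible noise.
```

On this reading, a sufficient statistic for the Mona Lisa is the simplest set of images sharing its regularities; the remaining bits merely specify which member of the set it is.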
References
A.N. Kolmogorov, “Three Approaches to the Concept of the Amount of Information,” Problemy Peredachi Informatsii, 1, (1965), pp. 3–11.
A.K. Zvonkin and L.A. Levin, “The Complexity of Finite Objects and the Development of the Concepts of Information and Randomness by Means of the Theory of Algorithms,” Russian Mathematical Surveys, 25, (1970), pp. 83–124.
C.P. Schnorr, “A Unified Approach to the Definition of Random Sequences,” Math. Systems Theory, 5, No. 3, (1971), pp. 246–258.
R.J. Solomonoff, “A Formal Theory of Inductive Inference, Part I,” Information and Control, 7, (1964), pp. 1–22.
R.J. Solomonoff, “A Formal Theory of Inductive Inference, Part II,” Information and Control, 7, (1964), pp. 224–254.
T. Fine, Theories of Probability, 1974.
T. Cover, “Generalization on Patterns Using Kolmogorov Complexity,” Proc. 1st International Joint Conference on Pattern Recognition, Washington, D.C. (1973).
T. Cover, “Geometrical and Statistical Properties of Linear Threshold Functions with Applications in Pattern Recognition,” IEEE Trans. Electronic Computers, (1965).
T. Cover and S.K. Leung-Yan-Cheong, “Some Equivalences between Shannon Entropy and Kolmogorov Complexity,” IEEE Trans. on Information Theory, Vol. IT-24, No. 3, May 1978, pp. 331–338.
T. Cover, “Universal Gambling Schemes and the Complexity Measures of Kolmogorov and Chaitin,” Technical Report No. 12, (1974), Dept. of Statistics, Stanford University.
G. Chaitin, “A Theory of Program Size Formally Identical to Information Theory,” J. of the Assoc. for Computing Machinery, Vol. 22, No. 3, July 1975, pp. 329–341.
Copyright information
© 1985 Martinus Nijhoff Publishers, Dordrecht
Cite this chapter
Cover, T.M. (1985). Kolmogorov Complexity, Data Compression, and Inference. In: Skwirzynski, J.K. (eds) The Impact of Processing Techniques on Communications. NATO ASI Series, vol 91. Springer, Dordrecht. https://doi.org/10.1007/978-94-009-5113-6_2
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-010-8760-5
Online ISBN: 978-94-009-5113-6