Convergence of Discrete MDL for Sequential Prediction
We study the properties of the Minimum Description Length (MDL) principle for sequence prediction, considering a two-part MDL estimator chosen from a countable class of models. This applies in particular to the important case of universal sequence prediction, where the model class corresponds to all algorithms for some fixed universal Turing machine (the correspondence is via enumerable semimeasures, hence the resulting models are stochastic). We prove convergence theorems similar to Solomonoff's theorem of universal induction, which also holds for general Bayes mixtures. The bound characterizing the convergence speed for MDL predictions is exponentially larger than the corresponding bound for Bayes mixtures. We observe that there are at least three different ways of using MDL for prediction. One of these has worse prediction properties: its predictions converge only if the MDL estimator stabilizes. We establish sufficient conditions for this to occur. Finally, some immediate consequences for complexity relations and randomness criteria are proven.
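As an illustration only, and not taken from the paper, the difference between a two-part MDL estimator and a Bayes mixture can be sketched on a toy model class. The sketch assumes a hypothetical finite class of three Bernoulli models with made-up prior weights; the names `MODELS`, `mdl_estimator`, `mdl_predict` and `bayes_predict` are invented for this example. It shows just one possible way of predicting with MDL (predict with the currently selected model) and makes no claim about which of the three variants in the paper this corresponds to.

```python
from math import log2

# Hypothetical toy model class: Bernoulli(theta) models with prior weights.
# Keys are theta = P(next bit is 1); values are weights summing to 1.
MODELS = {0.25: 0.25, 0.5: 0.5, 0.75: 0.25}


def likelihood(theta, xs):
    """Probability that a Bernoulli(theta) source emits the bit list xs."""
    p = 1.0
    for x in xs:
        p *= theta if x == 1 else 1.0 - theta
    return p


def mdl_estimator(xs):
    """Pick the model with the shortest two-part code length.

    Code length = -log2(weight) (model) + -log2(likelihood) (data given model).
    """
    return min(
        MODELS,
        key=lambda th: -log2(MODELS[th]) - log2(likelihood(th, xs)),
    )


def mdl_predict(xs):
    """Predict P(next bit is 1) using only the selected MDL model."""
    return mdl_estimator(xs)


def bayes_predict(xs):
    """Predict P(next bit is 1) using the weighted mixture of all models."""
    num = sum(w * likelihood(th, xs + [1]) for th, w in MODELS.items())
    den = sum(w * likelihood(th, xs) for th, w in MODELS.items())
    return num / den


if __name__ == "__main__":
    history = [1] * 10
    print("MDL model:", mdl_estimator(history))
    print("MDL prediction:", mdl_predict(history))
    print("Bayes prediction:", bayes_predict(history))
```

In this toy setting the MDL prediction jumps between models as the data grow, while the mixture prediction moves smoothly; this is only meant to make the two notions concrete, not to reproduce any bound from the paper.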