
On the Convergence Speed of MDL Predictions for Bernoulli Sequences

Conference paper
Algorithmic Learning Theory (ALT 2004)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3244)

Abstract

We consider the Minimum Description Length (MDL) principle for online sequence prediction. If the underlying model class is discrete, then the total expected square loss is a particularly interesting performance measure: (a) this quantity is bounded, implying convergence with probability one, and (b) it additionally specifies a rate of convergence. In general, only exponential loss bounds hold for MDL, as opposed to the linear bounds for a Bayes mixture. We show that this is the case even if the model class contains only Bernoulli distributions. We then derive a new upper bound on the prediction error for countable Bernoulli classes. This implies a small bound (comparable to the one for Bayes mixtures) for certain important model classes. The results apply to many machine learning tasks, including classification and hypothesis testing. We also provide arguments that our theorems generalize to countable classes of i.i.d. models.
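For concreteness, the loss quantity and the two kinds of bounds can be stated as follows. The notation is ours, following what is standard in this line of work, and is not quoted from the paper: $\mu$ is the true distribution, $\rho$ the predictor, and $w_\mu > 0$ the prior weight of the true model in the countable class.

\[
S_\rho \;=\; \sum_{t=1}^{\infty} \mathbf{E}\Bigl[\bigl(\rho(x_t{=}1 \mid x_{<t}) - \mu(x_t{=}1 \mid x_{<t})\bigr)^{2}\Bigr],
\qquad
S_{\mathrm{Bayes}} \;\le\; \ln w_\mu^{-1},
\qquad
S_{\mathrm{MDL}} \;\le\; c \, w_\mu^{-1}.
\]

That is, the Bayes mixture bound is linear in the description length $\ln w_\mu^{-1}$ of the true model, while the generic MDL bound is exponential in it; the paper's contribution is to sharpen the latter for Bernoulli (and, arguably, general i.i.d.) classes.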

This work was supported by SNF grant 2100-67712.02.
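To make the setting concrete, here is a minimal sketch of the two predictors being compared, for a finite Bernoulli class. The code is ours, not the authors'; the names (neg_log_lik, mdl_predict, bayes_predict) and the example class {0.1, ..., 0.9} are illustrative assumptions. The MDL predictor commits to the single model minimizing the two-part code length, while the Bayes mixture averages over the whole class:

```python
import math
import random

def neg_log_lik(theta, ones, zeros):
    """-log P_theta(data) in nats, given counts of ones and zeros;
    +inf if the data is impossible under theta."""
    if (theta == 0.0 and ones > 0) or (theta == 1.0 and zeros > 0):
        return float("inf")
    nll = 0.0
    if ones:
        nll -= ones * math.log(theta)
    if zeros:
        nll -= zeros * math.log(1.0 - theta)
    return nll

def mdl_predict(xs, thetas, weights):
    """Two-part MDL: pick theta minimizing -log w(theta) - log P_theta(xs),
    then predict P(next bit = 1) = theta with the selected model."""
    ones = sum(xs)
    zeros = len(xs) - ones
    best_theta, _ = min(
        zip(thetas, weights),
        key=lambda tw: -math.log(tw[1]) + neg_log_lik(tw[0], ones, zeros),
    )
    return best_theta

def bayes_predict(xs, thetas, weights):
    """Bayes mixture: posterior-weighted average prediction over the class."""
    ones = sum(xs)
    zeros = len(xs) - ones
    post = [w * math.exp(-neg_log_lik(th, ones, zeros))
            for th, w in zip(thetas, weights)]
    z = sum(post)
    return sum(p * th for p, th in zip(post, thetas)) / z

random.seed(0)
thetas = [k / 10 for k in range(1, 10)]      # discrete class {0.1, ..., 0.9}
weights = [1.0 / len(thetas)] * len(thetas)  # uniform prior weights
data = [1 if random.random() < 0.7 else 0 for _ in range(200)]
for t in (10, 50, 200):
    print(t, mdl_predict(data[:t], thetas, weights),
          round(bayes_predict(data[:t], thetas, weights), 3))
```

Both predictors home in on the true parameter 0.7 as the prefix grows; the question studied in the paper is how the cumulative squared prediction error of the MDL variant scales with the description length of the true model.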

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Poland, J., Hutter, M. (2004). On the Convergence Speed of MDL Predictions for Bernoulli Sequences. In: Ben-David, S., Case, J., Maruoka, A. (eds.) Algorithmic Learning Theory. ALT 2004. Lecture Notes in Computer Science, vol. 3244. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30215-5_23

  • DOI: https://doi.org/10.1007/978-3-540-30215-5_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23356-5

  • Online ISBN: 978-3-540-30215-5

