
On the Convergence Speed of MDL Predictions for Bernoulli Sequences

Conference paper
Algorithmic Learning Theory (ALT 2004)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3244)

Abstract

We consider the Minimum Description Length (MDL) principle for online sequence prediction. If the underlying model class is discrete, then the total expected square loss is a particularly interesting performance measure: (a) this quantity is bounded, implying convergence with probability one, and (b) it additionally specifies a rate of convergence. In general, only exponential loss bounds hold for MDL, as opposed to the linear bounds for a Bayes mixture. We show that this is the case even if the model class contains only Bernoulli distributions. We then derive a new upper bound on the prediction error for countable Bernoulli classes. This implies a small bound (comparable to the one for Bayes mixtures) for certain important model classes. The results apply to many machine learning tasks, including classification and hypothesis testing. We also provide arguments that our theorems generalize to countable classes of i.i.d. models.
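For concreteness, the loss quantity and the two kinds of bounds can be stated as follows. The notation is ours, following what is standard in this line of work, and is not quoted from the paper: $\mu$ is the true distribution, $\rho$ the predictor, and $w_\mu > 0$ the prior weight of the true model in the countable class.

\[
S_\rho \;=\; \sum_{t=1}^{\infty} \mathbf{E}\Bigl[\bigl(\rho(x_t{=}1 \mid x_{<t}) - \mu(x_t{=}1 \mid x_{<t})\bigr)^{2}\Bigr],
\qquad
S_{\mathrm{Bayes}} \;\le\; \ln w_\mu^{-1},
\qquad
S_{\mathrm{MDL}} \;\le\; c \, w_\mu^{-1}.
\]

That is, the Bayes mixture bound is linear in the description length $\ln w_\mu^{-1}$ of the true model, while the generic MDL bound is exponential in it; the paper's contribution is to sharpen the latter for Bernoulli (and, arguably, general i.i.d.) classes.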

This work was supported by SNF grant 2100-67712.02.
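To make the setting concrete, here is a minimal sketch of the two predictors being compared, for a finite Bernoulli class. The code is ours, not the authors'; the names (neg_log_lik, mdl_predict, bayes_predict) and the example class {0.1, ..., 0.9} are illustrative assumptions. The MDL predictor commits to the single model minimizing the two-part code length, while the Bayes mixture averages over the whole class:

```python
import math
import random

def neg_log_lik(theta, ones, zeros):
    """-log P_theta(data) in nats, given counts of ones and zeros;
    +inf if the data is impossible under theta."""
    if (theta == 0.0 and ones > 0) or (theta == 1.0 and zeros > 0):
        return float("inf")
    nll = 0.0
    if ones:
        nll -= ones * math.log(theta)
    if zeros:
        nll -= zeros * math.log(1.0 - theta)
    return nll

def mdl_predict(xs, thetas, weights):
    """Two-part MDL: pick theta minimizing -log w(theta) - log P_theta(xs),
    then predict P(next bit = 1) = theta with the selected model."""
    ones = sum(xs)
    zeros = len(xs) - ones
    best_theta, _ = min(
        zip(thetas, weights),
        key=lambda tw: -math.log(tw[1]) + neg_log_lik(tw[0], ones, zeros),
    )
    return best_theta

def bayes_predict(xs, thetas, weights):
    """Bayes mixture: posterior-weighted average prediction over the class."""
    ones = sum(xs)
    zeros = len(xs) - ones
    post = [w * math.exp(-neg_log_lik(th, ones, zeros))
            for th, w in zip(thetas, weights)]
    z = sum(post)
    return sum(p * th for p, th in zip(post, thetas)) / z

random.seed(0)
thetas = [k / 10 for k in range(1, 10)]      # discrete class {0.1, ..., 0.9}
weights = [1.0 / len(thetas)] * len(thetas)  # uniform prior weights
data = [1 if random.random() < 0.7 else 0 for _ in range(200)]
for t in (10, 50, 200):
    print(t, mdl_predict(data[:t], thetas, weights),
          round(bayes_predict(data[:t], thetas, weights), 3))
```

Both predictors home in on the true parameter 0.7 as the prefix grows; the question studied in the paper is how the cumulative squared prediction error of the MDL variant scales with the description length of the true model.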

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Poland, J., Hutter, M. (2004). On the Convergence Speed of MDL Predictions for Bernoulli Sequences. In: Ben-David, S., Case, J., Maruoka, A. (eds.) Algorithmic Learning Theory. ALT 2004. Lecture Notes in Computer Science, vol. 3244. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30215-5_23

  • DOI: https://doi.org/10.1007/978-3-540-30215-5_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23356-5

  • Online ISBN: 978-3-540-30215-5

