On the Foundations of Universal Sequence Prediction

  • Marcus Hutter
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3959)

Abstract

Solomonoff completed the Bayesian framework by providing a rigorous, unique, formal, and universal choice for the model class and the prior. We discuss in breadth how and in which sense universal (non-i.i.d.) sequence prediction solves various (philosophical) problems of traditional Bayesian sequence prediction. We show that Solomonoff’s model possesses many desirable properties: it converges quickly with strong bounds; unlike most classical continuous prior densities it has no zero p(oste)rior problem, i.e. it can confirm universal hypotheses; it is reparametrization and regrouping invariant; and it avoids the old-evidence and updating problems. It even performs well (actually better) in non-computable environments.
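For concreteness, the following is a brief sketch of the standard definitions behind the abstract's claims. The notation (a Bayes mixture ξ over a class 𝓜 with weights w_ν, Solomonoff's universal prior M, a universal monotone machine U, and prefix complexity K) is the usual one from the literature and is not quoted from this page.

% Bayes mixture over a countable class M of environments with prior weights w_nu:
\[
  \xi(x_{1:n}) \;=\; \sum_{\nu\in\mathcal{M}} w_\nu\,\nu(x_{1:n}),
  \qquad w_\nu > 0, \quad \sum_{\nu\in\mathcal{M}} w_\nu \le 1 .
\]
% Solomonoff's choice: M = all lower semicomputable semimeasures with w_nu = 2^{-K(nu)};
% equivalently, via a universal monotone Turing machine U,
\[
  M(x) \;=\; \sum_{p\,:\,U(p)=x*} 2^{-\ell(p)} .
\]
% "Fast convergence and strong bounds" refers to results of the type
\[
  \sum_{t=1}^{n} \mathbf{E}\!\left[\sum_{x_t} \mu(x_t\mid x_{<t})
    \ln\frac{\mu(x_t\mid x_{<t})}{\xi(x_t\mid x_{<t})}\right]
  \;\le\; \ln w_\mu^{-1} \;=\; K(\mu)\ln 2 \quad\text{(for Solomonoff's weights)},
\]
% which holds for every computable true environment mu in the class and implies
% that xi's predictions converge to mu's with mu-probability 1.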



Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Marcus Hutter
    1. IDSIA, Manno-Lugano, Switzerland
