# The Speed Prior: A New Simplicity Measure Yielding Near-Optimal Computable Predictions

## Abstract

Solomonoff’s optimal but *non*computable method for inductive inference assumes that observation sequences *x* are drawn from a recursive prior distribution *μ*(*x*). Instead of using the unknown *μ*(*x*), he predicts using the celebrated universal enumerable prior *M*(*x*), which for all *x* exceeds any recursive *μ*(*x*), save for a constant factor independent of *x*. The simplicity measure *M*(*x*) naturally implements “Occam’s razor” and is closely related to the Kolmogorov complexity of *x*. However, *M* assigns high probability to certain data *x* that are extremely hard to compute. This does not match our intuitive notion of simplicity. Here we suggest a more plausible measure derived from the fastest way of computing data. In the absence of contrary evidence, we assume that the physical world is generated by a computational process, and that any possibly infinite sequence of observations is therefore computable in the limit (this assumption is more radical and stronger than Solomonoff’s). Then we replace *M* by the novel Speed Prior *S*, under which the cumulative a priori probability of all data whose computation through an optimal algorithm requires more than *O*(*n*) resources is 1/*n*. We show that the Speed Prior allows for deriving a *computable* strategy for optimal prediction of future *y*, given past *x*. Then we consider the case that the data actually stem from a *non*optimal, unknown computational process, and use Hutter’s recent results to derive excellent expected loss bounds for *S*-based inductive inference. We conclude with several nontraditional predictions concerning the future of our universe.
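
As a reading aid, the dominance property of *M* stated above can be written as a formula. This is only a sketch: the symbol *c*<sub>*μ*</sub> is notation assumed here for the constant factor, and the second line is the standard conditional form used for prediction with a prior, which the abstract does not spell out.

```latex
% c_\mu is a label introduced here for the constant factor; it may depend
% on the recursive prior \mu but not on the data x.
\[
  M(x) \;\ge\; c_\mu \, \mu(x) \quad \text{for all } x, \qquad c_\mu > 0 .
\]
% Predicting a continuation y of the observed past x (standard conditional form):
\[
  M(y \mid x) \;=\; \frac{M(xy)}{M(x)} .
\]
```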

## Keywords

Inductive Inference, Minimum Description Length, Kolmogorov Complexity, Anthropic Principle, Universal Turing Machine

## References

- 1. C. H. Bennett and D. P. DiVicenzo. Quantum information and computation. *Nature*, 404(6775):256–259, 2000.
- 2. H. J. Bremermann. Minimum energy requirements of information transfer and computing. *International Journal of Theoretical Physics*, 21:203–217, 1982.
- 3. B. Carter. Large number coincidences and the anthropic principle in cosmology. In M. S. Longair, editor, *Proceedings of the IAU Symposium 63*, pages 291–298. Reidel, Dordrecht, 1974.
- 4. G. J. Chaitin. On the length of programs for computing finite binary sequences: statistical considerations. *Journal of the ACM*, 16:145–159, 1969.
- 5. H. Everett III. ‘Relative State’ formulation of quantum mechanics. *Reviews of Modern Physics*, 29:454–462, 1957.
- 6. P. Gács. On the relation between descriptional complexity and algorithmic probability. *Theoretical Computer Science*, 22:71–93, 1983.
- 7. D. F. Galouye. *Simulacron 3*. Bantam, 1964.
- 8. M. Hutter. Convergence and error bounds of universal prediction for general alphabet. *Proceedings of the 12th European Conference on Machine Learning (ECML-2001)*, (TR IDSIA-07-01, cs.AI/0103015), 2001.
- 9. M. Hutter. General loss bounds for universal sequence prediction. In C. E. Brodley and A. P. Danyluk, editors, *Proceedings of the 18th International Conference on Machine Learning (ICML-2001)*, pages 210–217. Morgan Kaufmann, 2001. TR IDSIA-03-01, IDSIA, Switzerland, Jan 2001, cs.AI/0101019.
- 10. M. Hutter. Towards a universal theory of artificial intelligence based on algorithmic probability and sequential decisions. *Proceedings of the 12th European Conference on Machine Learning (ECML-2001)*, (TR IDSIA-14-00, cs.AI/0012011), 2001.
- 11. M. Hutter. The fastest and shortest algorithm for all well-defined problems. *International Journal of Foundations of Computer Science*, (TR IDSIA-16-00, cs.CC/0102018), 2002. In press.
- 12. A. N. Kolmogorov. Three approaches to the quantitative definition of information. *Problems of Information Transmission*, 1:1–11, 1965.
- 13. L. G. Kraft. A device for quantizing, grouping, and coding amplitude modulated pulses. M.Sc. Thesis, Dept. of Electrical Engineering, MIT, Cambridge, Mass., 1949.
- 14. L. A. Levin. Universal sequential search problems. *Problems of Information Transmission*, 9(3):265–266, 1973.
- 15. M. Li and P. M. B. Vitányi. *An Introduction to Kolmogorov Complexity and its Applications (2nd edition)*. Springer, 1997.
- 16. S. Lloyd. Ultimate physical limits to computation. *Nature*, 406:1047–1054, 2000.
- 17. J. Rissanen. Stochastic complexity and modeling. *The Annals of Statistics*, 14(3):1080–1100, 1986.
- 18. C. Schmidhuber. Strings from logic. Technical Report CERN-TH/2000-316, CERN, Theory Division, 2000. http://xxx.lanl.gov/abs/hep-th/0011065.
- 19. J. Schmidhuber. Discovering solutions with low Kolmogorov complexity and high generalization capability. In A. Prieditis and S. Russell, editors, *Machine Learning: Proceedings of the Twelfth International Conference*, pages 488–496. Morgan Kaufmann Publishers, San Francisco, CA, 1995.
- 20. J. Schmidhuber. A computer scientist’s view of life, the universe, and everything. In C. Freksa, M. Jantzen, and R. Valk, editors, *Foundations of Computer Science: Potential-Theory-Cognition*, volume 1337, pages 201–208. Lecture Notes in Computer Science, Springer, Berlin, 1997.
- 21. J. Schmidhuber. Discovering neural nets with low Kolmogorov complexity and high generalization capability. *Neural Networks*, 10(5):857–873, 1997.
- 22. J. Schmidhuber. Algorithmic theories of everything. Technical Report IDSIA-20-00, quant-ph/0011122, IDSIA, Manno (Lugano), Switzerland, 2000.
- 23. J. Schmidhuber. Hierarchies of generalized Kolmogorov complexities and nonenumerable universal measures computable in the limit. *International Journal of Foundations of Computer Science*, 2002. In press.
- 24. R. J. Solomonoff. A formal theory of inductive inference. Part I. *Information and Control*, 7:1–22, 1964.
- 25. R. J. Solomonoff. Complexity-based induction systems. *IEEE Transactions on Information Theory*, IT-24(5):422–432, 1978.
- 26. G. ’t Hooft. Quantum gravity as a dissipative deterministic system. Technical Report SPIN-1999/07/gr-gc/9903084, http://xxx.lanl.gov/abs/gr-qc/9903084, Institute for Theoretical Physics, Univ. of Utrecht, and Spinoza Institute, Netherlands, 1999. Also published in *Classical and Quantum Gravity 16*, 3263.
- 27. C. S. Wallace and D. M. Boulton. An information theoretic measure for classification. *Computer Journal*, 11(2):185–194, 1968.
- 28. K. Zuse. *Rechnender Raum*. Friedrich Vieweg & Sohn, Braunschweig, 1969.
- 29. A. K. Zvonkin and L. A. Levin. The complexity of finite objects and the algorithmic concepts of information and randomness. *Russian Math. Surveys*, 25(6):83–124, 1970.