Ultimate Cognition à la Gödel

Cognitive Computation


Abstract

“All life is problem solving,” said Popper. To deal with arbitrary problems in arbitrary environments, an ultimate cognitive agent should use its limited hardware in the “best” and “most efficient” possible way. Can we formally nail down this informal statement, and derive a mathematically rigorous blueprint of ultimate cognition? Yes, we can, using Kurt Gödel’s celebrated self-reference trick of 1931 in a new way. Gödel exhibited the limits of mathematics and computation by creating a formula that speaks about itself, claiming to be unprovable by an algorithmic theorem prover: either the formula is true but unprovable, or math itself is flawed in an algorithmic sense. Here we describe an agent-controlling program that speaks about itself, ready to rewrite itself in arbitrary fashion once it has found a proof that the rewrite is useful according to a user-defined utility function. Any such rewrite is necessarily globally optimal (no local maxima!), since this proof necessarily must have demonstrated the uselessness of continuing the proof search for even better rewrites. Our self-referential program will optimally speed up its proof searcher and other program parts, but only if the speed-up’s utility is indeed provable: even ultimate cognition has limits of the Gödelian kind.
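The control flow described in the abstract can be illustrated with a minimal Python sketch. It is not the article's construction: the names `Candidate`, `utility`, `verified_improvement`, and `self_improve` are hypothetical, the formal proof search is replaced by an exhaustive check over a small finite input domain, and the toy utility simply counts inputs on which a policy returns the square of its argument. The sketch only shows the central rule that a rewrite is executed if and only if its usefulness has been verified.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple

Policy = Callable[[int], int]


@dataclass
class Candidate:
    """A proposed rewrite of the agent's policy (hypothetical toy type)."""
    name: str
    policy: Policy


def utility(policy: Policy, inputs: Iterable[int]) -> int:
    """Toy user-defined utility: number of inputs x with policy(x) == x * x."""
    return sum(1 for x in inputs if policy(x) == x * x)


def verified_improvement(current: Policy, candidate: Policy,
                         inputs: Iterable[int]) -> bool:
    """Stand-in for a proof: exhaustive check on a finite domain.

    ASSUMPTION: a real Goedel machine searches for a formal proof in an
    axiomatic system; here the domain is small enough to check directly.
    """
    inputs = list(inputs)
    return utility(candidate, inputs) > utility(current, inputs)


def self_improve(initial: Candidate, candidates: Iterable[Candidate],
                 inputs: Iterable[int], budget: int) -> Tuple[Candidate, List[str]]:
    """Examine up to `budget` candidates; switch only on a verified improvement."""
    inputs = list(inputs)
    current = initial
    accepted: List[str] = []
    for step, cand in enumerate(candidates):
        if step >= budget:
            break
        # The rewrite is executed only after its usefulness has been verified.
        if verified_improvement(current.policy, cand.policy, inputs):
            current = cand
            accepted.append(cand.name)
    return current, accepted


if __name__ == "__main__":
    start = Candidate("identity", lambda x: x)
    proposals = [
        Candidate("double", lambda x: 2 * x),
        Candidate("square", lambda x: x * x),
        Candidate("cube", lambda x: x ** 3),
    ]
    final, accepted = self_improve(start, proposals, range(5), budget=10)
    print(final.name, accepted)  # square ['square']
```

In this toy run, "double" is rejected because it matches the squares on exactly as many inputs as "identity" does, so no strict improvement can be verified. The sketch leaves out the part of the argument that makes an accepted rewrite globally optimal, namely that the proof must also weigh the rewrite against the value of continuing the search.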


Notes

  1. Or ‘Goedel machine’, to avoid the Umlaut. But ‘Godel machine’ would not be quite correct. Not to be confused with what Penrose calls, in a different context, ‘Gödel’s putative theorem-proving machine’ [36]!

  2. Turing reformulated Gödel’s unprovability results in terms of Turing machines (TMs) [74], which subsequently became the most widely used abstract model of computation. It is well known that there are universal TMs that in a certain sense can emulate any other TM or any other known computer. Gödel’s integer-based formal language can be used to describe any universal TM, and vice versa.

  3. We see that certain parts of the current state s may not be directly observable without changing the observable itself. Sometimes, however, axioms and previous observations will allow the Gödel machine to deduce time-dependent storage contents that are not directly observable. For instance, by analyzing the code being executed through instruction pointer IP in the example above, the value of IP at certain times may be predictable (or postdictable, after the fact). The values of other variables at given times, however, may not be deducible at all. Such limits of self-observability are reminiscent of Heisenberg’s celebrated uncertainty principle [16], which states that certain physical measurements are necessarily imprecise, since the measuring process affects the measured quantity.


References

  1. Aleksander I. The world in my mind, my mind in the world: key mechanisms of consciousness in humans, animals and machines. Exeter: Imprint Academic; 2005.

  2. Baars B, Gage NM. Cognition, brain and consciousness: an introduction to cognitive neuroscience. London: Elsevier/Academic Press; 2007.

  3. Banzhaf W, Nordin P, Keller RE, Francone FD. Genetic programming—an introduction. San Francisco, CA: Morgan Kaufmann Publishers; 1998.

  4. Bellman R. Adaptive control processes. Princeton, NJ: Princeton University Press; 1961.

  5. Blum M. A machine-independent theory of the complexity of recursive functions. J ACM. 1967;14(2):322–36.

  6. Blum M. On effective procedures for speeding up algorithms. J ACM. 1971;18(2):290–305.

  7. Butz M. How and why the brain lays the foundations for a conscious self. Constructivist Found. 2008;4(1):1–14.

  8. Cantor G. Über eine Eigenschaft des Inbegriffes aller reellen algebraischen Zahlen. Crelle’s Journal für Mathematik. 1874;77:258–63.

  9. Chaitin GJ. A theory of program size formally identical to information theory. J ACM. 1975;22:329–40.

  10. Clocksin WF, Mellish CS. Programming in Prolog. 3rd ed. NY: Springer-Verlag; 1987.

  11. Cramer NL. A representation for the adaptive generation of simple sequential programs. In: Grefenstette JJ, editor. Proceedings of an international conference on genetic algorithms and their applications, Carnegie-Mellon University, July 24–26. Hillsdale, NJ: Lawrence Erlbaum Associates; 1985.

  12. Crick F, Koch C. Consciousness and neuroscience. Cerebral Cortex. 1998;8:97–107.

  13. Fitting MC. First-order logic and automated theorem proving. Graduate texts in computer science. 2nd ed. Berlin: Springer-Verlag; 1996.

  14. Gödel K. Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I. Monatshefte für Mathematik und Physik. 1931;38:173–98.

  15. Haikonen P. The cognitive approach to conscious machines. London: Imprint Academic; 2003.

  16. Heisenberg W. Über den anschaulichen Inhalt der quantentheoretischen Kinematik und Mechanik. Zeitschrift für Physik. 1927;43:172–98.

  17. Hochreiter S, Younger AS, Conwell PR. Learning to learn using gradient descent. In: Lecture Notes in Computer Science 2130, Proceedings of the international conference on artificial neural networks (ICANN-2001). Heidelberg: Springer; 2001. p. 87–94.

  18. Hofstadter DR. Gödel, Escher, Bach: an eternal golden braid. NY: Basic Books; 1979.

  19. Holland JH. Properties of the bucket brigade. In: Proceedings of an international conference on genetic algorithms. Hillsdale, NJ: Lawrence Erlbaum; 1985.

  20. Hutter M. Towards a universal theory of artificial intelligence based on algorithmic probability and sequential decisions. In: Proceedings of the 12th European conference on machine learning (ECML-2001); 2001. p. 226–38 (On J. Schmidhuber’s SNF grant 20-61847).

  21. Hutter M. The fastest and shortest algorithm for all well-defined problems. Int J Found Comput Sci. 2002;13(3):431–43 (On J. Schmidhuber’s SNF grant 20-61847).

  22. Hutter M. Self-optimizing and Pareto-optimal policies in general environments based on Bayes-mixtures. In: Kivinen J and Sloan RH, editors. Proceedings of the 15th annual conference on computational learning theory (COLT 2002), Lecture Notes in Artificial Intelligence. Sydney, Australia: Springer; 2002. p. 364–79 (On J. Schmidhuber’s SNF grant 20-61847).

  23. Hutter M. Universal artificial intelligence: sequential decisions based on algorithmic probability. Berlin: Springer; 2004.

  24. Kaelbling LP, Littman ML, Moore AW. Reinforcement learning: a survey. J AI Res. 1996;4:237–85.

  25. Kolmogorov AN. Grundbegriffe der Wahrscheinlichkeitsrechnung. Berlin: Springer; 1933.

  26. Kolmogorov AN. Three approaches to the quantitative definition of information. Probl Inf Transm. 1965;1:1–11.

  27. Lenat D. Theory formation by heuristic search. Artif Intell. 1983;21:31–59.

  28. Levin LA. Universal sequential search problems. Probl Inf Transm. 1973;9(3):265–66.

  29. Levin LA. Laws of information (nongrowth) and aspects of the foundation of probability theory. Probl Inf Transm. 1974;10(3):206–10.

  30. Levin LA. Randomness conservation inequalities: information and independence in mathematical theories. Inf Control. 1984;61:15–37.

  31. Li M, Vitányi PMB. An introduction to Kolmogorov complexity and its applications. 2nd ed. NY: Springer; 1997.

  32. Löwenheim L. Über Möglichkeiten im Relativkalkül. Mathematische Annalen. 1915;76:447–70.

  33. Mitchell T. Machine learning. NY: McGraw Hill; 1997.

  34. Moore CH, Leach GC. FORTH—a language for interactive computing. Amsterdam: Mohasco Industries Inc.; 1970.

  35. Newell A, Simon H. GPS, a program that simulates human thought. In: Feigenbaum E, Feldman J, editors. Computers and thought. New York: McGraw-Hill; 1963. p. 279–93.

  36. Penrose R. Shadows of the mind. Oxford: Oxford University Press; 1994.

  37. Popper KR. All life is problem solving. London: Routledge; 1999.

  38. Rice HG. Classes of recursively enumerable sets and their decision problems. Trans Am Math Soc. 1953;74:358–66.

  39. Rosenbloom PS, Laird JE, Newell A. The SOAR papers. Cambridge: MIT Press; 1993.

  40. Samuel AL. Some studies in machine learning using the game of checkers. IBM J Res Dev. 1959;3:210–29.

  41. Schmidhuber J. Evolutionary principles in self-referential learning. Diploma thesis, Institut für Informatik, Technische Universität München; 1987.

  42. Schmidhuber J. Dynamische neuronale Netze und das fundamentale raumzeitliche Lernproblem. Dissertation, Institut für Informatik, Technische Universität München; 1990.

  43. Schmidhuber J. Reinforcement learning in Markovian and non-Markovian environments. In: Lippmann RP, Moody JE, Touretzky DS, editors. Advances in neural information processing systems 3 (NIPS 3). San Francisco, CA: Morgan Kaufmann; 1991. p. 500–6.

  44. Schmidhuber J. A self-referential weight matrix. In: Proceedings of the international conference on artificial neural networks. Amsterdam: Springer; 1993. p. 446–51.

  45. Schmidhuber J. Discovering solutions with low Kolmogorov complexity and high generalization capability. In: Prieditis A, Russell S, editors. Machine learning: Proceedings of the twelfth international conference. San Francisco, CA: Morgan Kaufmann Publishers; 1995. p. 488–96.

  46. Schmidhuber J. A computer scientist’s view of life, the universe, and everything. In: Freksa C, Jantzen M, Valk R, editors. Foundations of computer science: potential-theory-cognition, vol 1337. Lecture Notes in Computer Science. Berlin: Springer; 1997. p. 201–8.

  47. Schmidhuber J. Discovering neural nets with low Kolmogorov complexity and high generalization capability. Neural Netw. 1997;10(5):857–73.

  48. Schmidhuber J. Algorithmic theories of everything. Technical Report IDSIA-20-00, quant-ph/0011122, IDSIA, Manno (Lugano), Switzerland. Sections 1–5: see [50]; Section 6: see [51]; 2000.

  49. Schmidhuber J. Sequential decision making based on direct search. In: Sun R, Giles CL, editors. Sequence learning: paradigms, algorithms, and applications. Lecture Notes on AI 1828. Berlin: Springer; 2001.

  50. Schmidhuber J. Hierarchies of generalized Kolmogorov complexities and nonenumerable universal measures computable in the limit. Int J Found Comput Sci. 2002;13(4):587–612.

  51. Schmidhuber J. The speed prior: a new simplicity measure yielding near-optimal computable predictions. In: Kivinen J, Sloan RH, editors. Proceedings of the 15th annual conference on computational learning theory (COLT 2002). Lecture Notes in Artificial Intelligence. Sydney, Australia: Springer; 2002. p. 216–28.

  52. Schmidhuber J. Bias-optimal incremental problem solving. In: Becker S, Thrun S, Obermayer K, editors. Advances in neural information processing systems 15 (NIPS 15). Cambridge, MA: MIT Press; 2003. p. 1571–8.

  53. Schmidhuber J. Towards solving the grand problem of AI. In: Quaresma P, Dourado A, Costa E, Costa JF, editors. Soft computing and complex systems. Coimbra, Portugal: Centro Internacional de Mathematica; 2003. p. 77–97. Based on [58].

  54. Schmidhuber J. Optimal ordered problem solver. Mach Learn. 2004;54:211–54.

  55. Schmidhuber J. Completely self-referential optimal reinforcement learners. In: Duch W, Kacprzyk J, Oja E, Zadrozny S, editors. Artificial neural networks: biological inspirations—ICANN 2005. LNCS 3697. Berlin, Heidelberg: Springer-Verlag. 2005. p. 223–33 (Plenary talk).

  56. Schmidhuber J. Gödel machines: towards a technical justification of consciousness. In: Kudenko D, Kazakov D, Alonso E, editors. Adaptive agents and multi-agent systems III. LNCS 3394. Berlin: Springer Verlag; 2005. p. 1–23.

  57. Schmidhuber J. Gödel machines: fully self-referential optimal universal self-improvers. In: Goertzel B, Pennachin C, editors. Artificial general intelligence. Berlin: Springer Verlag; 2006. p. 199–226. Preprint available as arXiv:cs.LO/0309048.

  58. Schmidhuber J. The new AI: general & sound & relevant for physics. In: Goertzel B, Pennachin C, editors. Artificial general intelligence. Berlin: Springer; 2006. p. 175–98. Also available as TR IDSIA-04-03, arXiv:cs.AI/0302012.

  59. Schmidhuber J. Randomness in physics. Nature. 2006;439(3):392 (Correspondence).

  60. Schmidhuber J. 2006: Celebrating 75 years of AI—history and outlook: the next 25 years. In: Lungarella M, Iida F, Bongard J, Pfeifer R, editors. 50 Years of artificial intelligence, vol LNAI 4850. Berlin, Heidelberg: Springer; 2007. p. 29–41.

  61. Schmidhuber J. Alle berechenbaren Universen (All computable universes). Spektrum der Wissenschaft Spezial (German edition of Scientific American) 2007;(3):75–9.

  62. Schmidhuber J. New millennium AI and the convergence of history. In: Duch W, Mandziuk J, editors. Challenges to computational intelligence, vol 63. Studies in Computational Intelligence, Springer; 2007. p. 15–36. Also available as arXiv:cs.AI/0606081.

  63. Schmidhuber J, Zhao J, Schraudolph N. Reinforcement learning with self-modifying policies. In: Thrun S, Pratt L, editors. Learning to learn. The Netherlands: Kluwer; 1997. p. 293–309.

  64. Schmidhuber J, Zhao J, Wiering M. Shifting inductive bias with success-story algorithm, adaptive Levin search, and incremental self-improvement. Mach Learn. 1997;28:105–30.

  65. Schmidhuber J, Graves A, Gomez F, Fernandez S, Hochreiter S. How to learn programs with artificial recurrent neural networks. Cambridge: Cambridge University Press; 2009 (in preparation).

  66. Seth AK, Izhikevich E, Reeke GN, Edelman GM. Theories and measures of consciousness: an extended framework. Proc Natl Acad Sci USA. 2006;103:10799–804.

  67. Siegelmann HT, Sontag ED. Turing computability with neural nets. Appl Math Lett. 1991;4(6):77–80.

  68. Skolem T. Logisch-kombinatorische Untersuchungen über Erfüllbarkeit oder Beweisbarkeit mathematischer Sätze nebst einem Theorem über dichte Mengen. Skrifter utgit av Videnskapsselskapet in Kristiania, I, Mat.-Nat. Kl., N; 1919;4:1–36.

  69. Sloman A, Chrisley RL. Virtual machines and consciousness. J Conscious Stud. 2003;10(4–5):113–72.

  70. Solomonoff RJ. A formal theory of inductive inference. Part I. Inf Control. 1964;7:1–22.

  71. Solomonoff RJ. Complexity-based induction systems. IEEE Trans Inf Theory. 1978;IT-24(5):422–32.

  72. Solomonoff RJ. Progress in incremental machine learning—preliminary report for NIPS 2002 workshop on universal learners and optimal search; revised September 2003. Technical Report IDSIA-16-03, Lugano: IDSIA; 2003.

  73. Sutton R, Barto A. Reinforcement learning: an introduction. Cambridge, MA: MIT Press; 1998.

  74. Turing AM. On computable numbers, with an application to the Entscheidungsproblem. Proc Lond Math Soc Ser 2. 1936;42:230–65.

  75. Utgoff P. Shift of bias for inductive concept learning. In: Michalski R, Carbonell J, Mitchell T, editors. Machine learning, vol 2. Los Altos, CA: Morgan Kaufmann; 1986. p. 163–90.

  76. Wolpert DH, Macready WG. No free lunch theorems for search. IEEE Trans Evolution Comput. 1997;1.

  77. Zuse K. Rechnender Raum. Elektronische Datenverarbeitung. 1967;8:336–44.

  78. Zuse K. Rechnender Raum. Friedrich Vieweg & Sohn, Braunschweig, 1969. [English translation: Calculating Space]. MIT Technical Translation AZT-70-164-GEMIT. Cambridge, MA: Massachusetts Institute of Technology (Proj. MAC); 1970.


Acknowledgments

Thanks to Alexey Chernov, Marcus Hutter, Jan Poland, Ray Solomonoff, Sepp Hochreiter, Shane Legg, Leonid Levin, Alex Graves, Matteo Gagliolo, Viktor Zhumatiy, Ben Goertzel, Will Pearson, and Faustino Gomez for useful comments on drafts or summaries or earlier versions of this article. I am also grateful to many others who asked questions during Gödel machine talks or sent comments by email.

Author information

Corresponding author

Correspondence to Jürgen Schmidhuber.

About this article

Cite this article

Schmidhuber, J. Ultimate Cognition à la Gödel. Cogn Comput 1, 177–193 (2009).
