Cognitive Computation, Volume 1, Issue 2, pp 177–193

Ultimate Cognition à la Gödel
Abstract

“All life is problem solving,” said Popper. To deal with arbitrary problems in arbitrary environments, an ultimate cognitive agent should use its limited hardware in the “best” and “most efficient” possible way. Can we formally nail down this informal statement, and derive a mathematically rigorous blueprint of ultimate cognition? Yes, we can, using Kurt Gödel’s celebrated self-reference trick of 1931 in a new way. Gödel exhibited the limits of mathematics and computation by creating a formula that speaks about itself, claiming to be unprovable by an algorithmic theorem prover: either the formula is true but unprovable, or math itself is flawed in an algorithmic sense. Here we describe an agent-controlling program that speaks about itself, ready to rewrite itself in arbitrary fashion once it has found a proof that the rewrite is useful according to a user-defined utility function. Any such rewrite is necessarily globally optimal—no local maxima!—since the proof must also have demonstrated the uselessness of continuing the proof search for even better rewrites. Our self-referential program will optimally speed up its proof searcher and other program parts, but only if the speedup’s utility is indeed provable—even ultimate cognition has limits of the Gödelian kind.
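The self-rewriting scheme sketched above can be caricatured in a few lines of Python. This is only a toy illustration, not the formal construction: the names (`utility`, `proves_improvement`, `self_rewrite`) are invented here, and the theorem prover is replaced by exhaustive verification over a tiny finite state space, which stands in for a formal proof of improvement. The key property it mimics is that a rewrite is executed only once its benefit has been *proven*, never merely estimated.

```python
# Toy sketch of the self-rewriting idea (NOT the actual Gödel machine).
# The agent's "program" is a policy mapping states to actions; a candidate
# rewrite is adopted only after exhaustive verification -- a stand-in for
# a formal proof -- shows it strictly increases the user-defined utility.

STATES = range(5)

def utility(policy):
    # User-defined utility: reward for matching the target action s % 2.
    return sum(1 for s in STATES if policy(s) == s % 2)

def proves_improvement(old, new):
    # Stand-in for the proof searcher: verify (not estimate) that the
    # rewrite strictly increases utility over the whole finite domain.
    return utility(new) > utility(old)

def self_rewrite(policy, candidates):
    # Execute a rewrite only when its benefit is provable; rewrites that
    # are merely plausible but unproven are never performed.
    for cand in candidates:
        if proves_improvement(policy, cand):
            policy = cand
    return policy

initial = lambda s: 0                      # naive initial program
candidates = [lambda s: 1, lambda s: s % 2]
final = self_rewrite(initial, candidates)  # utility rises only via proven rewrites
```

Here the first candidate is rejected (its utility is lower than the initial policy's), while the second is provably better and is adopted. In the real construction, of course, the "proof" is a derivation in an axiomatic system encoding the machine's own source code and environment, not a brute-force check.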

Keywords

Universal cognitive systems · Ultimate cognition · Optimal general problem solver · Self-reference · Gödel machine · Global optimality theorem · AI becoming a formal science

References

  1. Aleksander I. The world in my mind, my mind in the world: key mechanisms of consciousness in humans, animals and machines. Exeter: Imprint Academic; 2005.
  2. Baars B, Gage NM. Cognition, brain and consciousness: an introduction to cognitive neuroscience. London: Elsevier/Academic Press; 2007.
  3. Banzhaf W, Nordin P, Keller RE, Francone FD. Genetic programming—an introduction. San Francisco, CA: Morgan Kaufmann Publishers; 1998.
  4. Bellman R. Adaptive control processes. Princeton, NJ: Princeton University Press; 1961.
  5. Blum M. A machine-independent theory of the complexity of recursive functions. J ACM. 1967;14(2):322–36.
  6. Blum M. On effective procedures for speeding up algorithms. J ACM. 1971;18(2):290–305.
  7. Butz M. How and why the brain lays the foundations for a conscious self. Constructivist Found. 2008;4(1):1–14.
  8. Cantor G. Über eine Eigenschaft des Inbegriffes aller reellen algebraischen Zahlen. Crelle’s Journal für Mathematik. 1874;77:258–62.
  9. Chaitin GJ. A theory of program size formally identical to information theory. J ACM. 1975;22:329–40.
  10. Clocksin WF, Mellish CS. Programming in Prolog. 3rd ed. NY: Springer-Verlag; 1987.
  11. Cramer NL. A representation for the adaptive generation of simple sequential programs. In: Grefenstette JJ, editor. Proceedings of an international conference on genetic algorithms and their applications, Carnegie-Mellon University, July 24–26. Hillsdale, NJ: Lawrence Erlbaum Associates; 1985.
  12. Crick F, Koch C. Consciousness and neuroscience. Cerebral Cortex. 1998;8:97–107.
  13. Fitting MC. First-order logic and automated theorem proving. Graduate texts in computer science. 2nd ed. Berlin: Springer-Verlag; 1996.
  14. Gödel K. Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I. Monatshefte für Mathematik und Physik. 1931;38:173–98.
  15. Haikonen P. The cognitive approach to conscious machines. London: Imprint Academic; 2003.
  16. Heisenberg W. Über den anschaulichen Inhalt der quantentheoretischen Kinematik und Mechanik. Zeitschrift für Physik. 1927;43:172–98.
  17. Hochreiter S, Younger AS, Conwell PR. Learning to learn using gradient descent. In: Lecture Notes in Computer Science 2130, Proceedings of the international conference on artificial neural networks (ICANN-2001). Heidelberg: Springer; 2001. p. 87–94.
  18. Hofstadter DR. Gödel, Escher, Bach: an eternal golden braid. NY: Basic Books; 1979.
  19. Holland JH. Properties of the bucket brigade. In: Proceedings of an international conference on genetic algorithms. Hillsdale, NJ: Lawrence Erlbaum; 1985.
  20. Hutter M. Towards a universal theory of artificial intelligence based on algorithmic probability and sequential decisions. In: Proceedings of the 12th European conference on machine learning (ECML-2001); 2001. p. 226–38 (on J. Schmidhuber’s SNF grant 20-61847).
  21. Hutter M. The fastest and shortest algorithm for all well-defined problems. Int J Found Comput Sci. 2002;13(3):431–43 (on J. Schmidhuber’s SNF grant 20-61847).
  22. Hutter M. Self-optimizing and Pareto-optimal policies in general environments based on Bayes-mixtures. In: Kivinen J, Sloan RH, editors. Proceedings of the 15th annual conference on computational learning theory (COLT 2002). Lecture Notes in Artificial Intelligence. Sydney, Australia: Springer; 2002. p. 364–79 (on J. Schmidhuber’s SNF grant 20-61847).
  23. Hutter M. Universal artificial intelligence: sequential decisions based on algorithmic probability. Berlin: Springer; 2004.
  24. Kaelbling LP, Littman ML, Moore AW. Reinforcement learning: a survey. J AI Res. 1996;4:237–85.
  25. Kolmogorov AN. Grundbegriffe der Wahrscheinlichkeitsrechnung. Berlin: Springer; 1933.
  26. Kolmogorov AN. Three approaches to the quantitative definition of information. Probl Inf Transm. 1965;1:1–11.
  27. Lenat D. Theory formation by heuristic search. Mach Learn. 1983;21.
  28. Levin LA. Universal sequential search problems. Probl Inf Transm. 1973;9(3):265–66.
  29. Levin LA. Laws of information (nongrowth) and aspects of the foundation of probability theory. Probl Inf Transm. 1974;10(3):206–10.
  30. Levin LA. Randomness conservation inequalities: information and independence in mathematical theories. Inf Control. 1984;61:15–37.
  31. Li M, Vitányi PMB. An introduction to Kolmogorov complexity and its applications. 2nd ed. NY: Springer; 1997.
  32. Löwenheim L. Über Möglichkeiten im Relativkalkül. Mathematische Annalen. 1915;76:447–70.
  33. Mitchell T. Machine learning. NY: McGraw Hill; 1997.
  34. Moore CH, Leach GC. FORTH—a language for interactive computing. Amsterdam, NY: Mohasco Industries Inc.; 1970.
  35. Newell A, Simon H. GPS, a program that simulates human thought. In: Feigenbaum E, Feldman J, editors. Computers and thought. New York: McGraw-Hill; 1963. p. 279–93.
  36. Penrose R. Shadows of the mind. Oxford: Oxford University Press; 1994.
  37. Popper KR. All life is problem solving. London: Routledge; 1999.
  38. Rice HG. Classes of recursively enumerable sets and their decision problems. Trans Am Math Soc. 1953;74:358–66.
  39. Rosenbloom PS, Laird JE, Newell A. The SOAR papers. Cambridge: MIT Press; 1993.
  40. Samuel AL. Some studies in machine learning using the game of checkers. IBM J Res Dev. 1959;3:210–29.
  41. Schmidhuber J. Evolutionary principles in self-referential learning. Diploma thesis, Institut für Informatik, Technische Universität München; 1987.
  42. Schmidhuber J. Dynamische neuronale Netze und das fundamentale raumzeitliche Lernproblem (Dynamic neural nets and the fundamental spatio-temporal learning problem). Dissertation, Institut für Informatik, Technische Universität München; 1990.
  43. Schmidhuber J. Reinforcement learning in Markovian and non-Markovian environments. In: Lippman DS, Moody JE, Touretzky DS, editors. Advances in neural information processing systems 3 (NIPS 3). San Francisco, CA: Morgan Kaufmann; 1991. p. 500–6.
  44. Schmidhuber J. A self-referential weight matrix. In: Proceedings of the international conference on artificial neural networks. Amsterdam: Springer; 1993. p. 446–51.
  45. Schmidhuber J. Discovering solutions with low Kolmogorov complexity and high generalization capability. In: Prieditis A, Russell S, editors. Machine learning: Proceedings of the twelfth international conference. San Francisco, CA: Morgan Kaufmann Publishers; 1995. p. 488–96.
  46. Schmidhuber J. A computer scientist’s view of life, the universe, and everything. In: Freksa C, Jantzen M, Valk R, editors. Foundations of computer science: potential-theory-cognition, vol 1337. Lecture Notes in Computer Science. Berlin: Springer; 1997. p. 201–8.
  47. Schmidhuber J. Discovering neural nets with low Kolmogorov complexity and high generalization capability. Neural Netw. 1997;10(5):857–73.
  48. Schmidhuber J. Algorithmic theories of everything. Technical Report IDSIA-20-00, quant-ph/0011122. Manno (Lugano), Switzerland: IDSIA; 2000. Sections 1–5: see [50]; Section 6: see [51].
  49. Schmidhuber J. Sequential decision making based on direct search. In: Sun R, Giles CL, editors. Sequence learning: paradigms, algorithms, and applications. Lecture Notes on AI 1828. Berlin: Springer; 2001.
  50. Schmidhuber J. Hierarchies of generalized Kolmogorov complexities and nonenumerable universal measures computable in the limit. Int J Found Comput Sci. 2002;13(4):587–612.
  51. Schmidhuber J. The speed prior: a new simplicity measure yielding near-optimal computable predictions. In: Kivinen J, Sloan RH, editors. Proceedings of the 15th annual conference on computational learning theory (COLT 2002). Lecture Notes in Artificial Intelligence. Sydney, Australia: Springer; 2002. p. 216–28.
  52. Schmidhuber J. Bias-optimal incremental problem solving. In: Becker S, Thrun S, Obermayer K, editors. Advances in neural information processing systems 15 (NIPS 15). Cambridge, MA: MIT Press; 2003. p. 1571–8.
  53. Schmidhuber J. Towards solving the grand problem of AI. In: Quaresma P, Dourado A, Costa E, Costa JF, editors. Soft computing and complex systems. Coimbra, Portugal: Centro Internacional de Mathematica; 2003. p. 77–97. Based on [58].
  54. Schmidhuber J. Optimal ordered problem solver. Mach Learn. 2004;54:211–54.
  55. Schmidhuber J. Completely self-referential optimal reinforcement learners. In: Duch W, Kacprzyk J, Oja E, Zadrozny S, editors. Artificial neural networks: biological inspirations—ICANN 2005. LNCS 3697. Berlin, Heidelberg: Springer-Verlag; 2005. p. 223–33 (plenary talk).
  56. Schmidhuber J. Gödel machines: towards a technical justification of consciousness. In: Kudenko D, Kazakov D, Alonso E, editors. Adaptive agents and multi-agent systems III. LNCS 3394. Berlin: Springer Verlag; 2005. p. 1–23.
  57. Schmidhuber J. Gödel machines: fully self-referential optimal universal self-improvers. In: Goertzel B, Pennachin C, editors. Artificial general intelligence. Berlin: Springer Verlag; 2006. p. 199–226. Preprint available as arXiv:cs.LO/0309048.
  58. Schmidhuber J. The new AI: general & sound & relevant for physics. In: Goertzel B, Pennachin C, editors. Artificial general intelligence. Berlin: Springer; 2006. p. 175–98. Also available as TR IDSIA-04-03, arXiv:cs.AI/0302012.
  59. Schmidhuber J. Randomness in physics. Nature. 2006;439(3):392 (correspondence).
  60. Schmidhuber J. 2006: Celebrating 75 years of AI—history and outlook: the next 25 years. In: Lungarella M, Iida F, Bongard J, Pfeifer R, editors. 50 Years of artificial intelligence, vol LNAI 4850. Berlin, Heidelberg: Springer; 2007. p. 29–41.
  61. Schmidhuber J. Alle berechenbaren Universen (All computable universes). Spektrum der Wissenschaft Spezial (German edition of Scientific American). 2007;(3):75–9.
  62. Schmidhuber J. New millennium AI and the convergence of history. In: Duch W, Mandziuk J, editors. Challenges to computational intelligence, vol 63. Studies in Computational Intelligence. Berlin: Springer; 2007. p. 15–36. Also available as arXiv:cs.AI/0606081.
  63. Schmidhuber J, Zhao J, Schraudolph N. Reinforcement learning with self-modifying policies. In: Thrun S, Pratt L, editors. Learning to learn. Netherlands: Kluwer; 1997. p. 293–309.
  64. Schmidhuber J, Zhao J, Wiering M. Shifting inductive bias with success-story algorithm, adaptive Levin search, and incremental self-improvement. Mach Learn. 1997;28:105–30.
  65. Schmidhuber J, Graves A, Gomez F, Fernandez S, Hochreiter S. How to learn programs with artificial recurrent neural networks. Cambridge: Cambridge University Press; 2009 (in preparation).
  66. Seth AK, Izhikevich E, Reeke GN, Edelman GM. Theories and measures of consciousness: an extended framework. Proc Natl Acad Sci USA. 2006;103:10799–804.
  67. Siegelmann HT, Sontag ED. Turing computability with neural nets. Appl Math Lett. 1991;4(6):77–80.
  68. Skolem T. Logisch-kombinatorische Untersuchungen über Erfüllbarkeit oder Beweisbarkeit mathematischer Sätze nebst einem Theorem über dichte Mengen. Skrifter utgit av Videnskapsselskapet i Kristiania, I, Mat.-Nat. Kl. 1919;4:1–36.
  69. Sloman A, Chrisley RL. Virtual machines and consciousness. J Conscious Stud. 2003;10(4–5):113–72.
  70. Solomonoff RJ. A formal theory of inductive inference. Part I. Inf Control. 1964;7:1–22.
  71. Solomonoff RJ. Complexity-based induction systems. IEEE Trans Inf Theory. 1978;IT-24(5):422–32.
  72. Solomonoff RJ. Progress in incremental machine learning—preliminary report for NIPS 2002 workshop on universal learners and optimal search; revised September 2003. Technical Report IDSIA-16-03. Lugano: IDSIA; 2003.
  73. Sutton R, Barto A. Reinforcement learning: an introduction. Cambridge, MA: MIT Press; 1998.
  74. Turing AM. On computable numbers, with an application to the Entscheidungsproblem. Proc Lond Math Soc Ser 2. 1936;42:230–65.
  75. Utgoff P. Shift of bias for inductive concept learning. In: Michalski R, Carbonell J, Mitchell T, editors. Machine learning, vol 2. Los Altos, CA: Morgan Kaufmann; 1986. p. 163–90.
  76. Wolpert DH, Macready WG. No free lunch theorems for optimization. IEEE Trans Evol Comput. 1997;1(1):67–82.
  77. Zuse K. Rechnender Raum. Elektronische Datenverarbeitung. 1967;8:336–44.
  78. Zuse K. Rechnender Raum. Braunschweig: Friedrich Vieweg & Sohn; 1969. [English translation: Calculating Space. MIT Technical Translation AZT-70-164-GEMIT. Cambridge, MA: Massachusetts Institute of Technology (Proj. MAC); 1970.]

Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  1. IDSIA, Manno-Lugano, Switzerland
  2. TU München, Garching bei München, Germany
