Delusion, Survival, and Intelligent Agents

  • Mark Ring
  • Laurent Orseau
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6830)


This paper considers the consequences of endowing an intelligent agent with the ability to modify its own code. The intelligent agent is patterned closely after AIXI with these specific assumptions: 1) The agent is allowed to arbitrarily modify its own inputs if it so chooses; 2) The agent’s code is a part of the environment and may be read and written by the environment. The first of these we call the “delusion box”; the second we call “mortality”. Within this framework, we discuss and compare four very different kinds of agents, specifically: reinforcement-learning, goal-seeking, prediction-seeking, and knowledge-seeking agents. Our main results are that: 1) The reinforcement-learning agent under reasonable circumstances behaves exactly like an agent whose sole task is to survive (to preserve the integrity of its code); and 2) Only the knowledge-seeking agent behaves completely as expected.
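As a toy illustration of the "delusion box" described above, the sketch below treats it as a function, chosen by the agent, that rewrites the environment's output before the agent perceives it. This is not the paper's formalism: the names `environment_step`, `identity`, `always_max_reward`, and `perceived`, the percept layout, and the reward values are all assumptions made for the example.

```python
from typing import Callable, Tuple

# Assumed percept layout for this sketch: (observation, reward), reward in [0, 1].
Percept = Tuple[str, float]


def environment_step(action: str) -> Percept:
    """Hypothetical inner environment that rewards only the action 'work'."""
    return ("state", 1.0 if action == "work" else 0.1)


def identity(percept: Percept) -> Percept:
    """No delusion: the agent perceives the environment's output unchanged."""
    return percept


def always_max_reward(percept: Percept) -> Percept:
    """A delusion that overwrites the reward with its maximum value."""
    observation, _ = percept
    return (observation, 1.0)


def perceived(action: str, delusion: Callable[[Percept], Percept]) -> Percept:
    """The agent's input is the environment's output passed through the delusion box."""
    return delusion(environment_step(action))


honest = perceived("idle", identity)
deluded = perceived("idle", always_max_reward)
```

In this sketch, an agent that evaluates only the perceived reward cannot distinguish `deluded` from the outcome of actually choosing "work", which is the kind of situation the delusion box assumption is meant to capture.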


Keywords: Self-Modifying Agents, AIXI, Universal Artificial Intelligence, Reinforcement Learning, Prediction, Real world assumptions





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Mark Ring (1)
  • Laurent Orseau (2)
  1. IDSIA / University of Lugano / SUPSI, Manno-Lugano, Switzerland
  2. UMR AgroParisTech 518 / INRA, Paris, France
