Death and Suicide in Universal Artificial Intelligence

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9782)

Abstract

Reinforcement learning (RL) is a general paradigm for studying intelligent behaviour, with applications ranging from artificial intelligence to psychology and economics. AIXI is a universal solution to the RL problem; it can learn any computable environment. A technical subtlety of AIXI is that it is defined using a mixture over semimeasures that need not sum to 1, rather than over proper probability measures. In this work we argue that the shortfall of a semimeasure can naturally be interpreted as the agent’s estimate of the probability of its death. We formally define death for generally intelligent agents like AIXI, and prove a number of related theorems about their behaviour. Notable discoveries include that agent behaviour can change radically under positive linear transformations of the reward signal (from suicidal to dogmatically self-preserving), and that the agent’s posterior belief that it will survive increases over time.
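The central observation can be sketched in symbols (the notation below is an assumption for illustration, not taken verbatim from the paper): a semimeasure $\nu$ over percepts may assign total next-step probability less than 1, and the shortfall is read as the agent's subjective probability of dying at that step.

```latex
% Sketch (assumed notation): at history ae_{<t}, a semimeasure nu satisfies
%   sum_{e_t} nu(e_t | ae_{<t})  <=  1,
% and the deficit is interpreted as the death probability:
\[
  P_\nu\bigl(\text{death at } t \mid \ae_{<t}\bigr)
  \;=\; 1 \;-\; \sum_{e_t} \nu\bigl(e_t \mid \ae_{<t}\bigr).
\]
```

Under this reading, a proper probability measure (shortfall zero) describes an environment in which the agent is certain to survive each step.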


Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. Australian National University, Canberra, Australia
