Optimistic AIXI

  • Peter Sunehag
  • Marcus Hutter
Conference paper

DOI: 10.1007/978-3-642-35506-6_32

Part of the Lecture Notes in Computer Science book series (LNCS, volume 7716)
Cite this paper as:
Sunehag P., Hutter M. (2012) Optimistic AIXI. In: Bach J., Goertzel B., Iklé M. (eds) Artificial General Intelligence. AGI 2012. Lecture Notes in Computer Science, vol 7716. Springer, Berlin, Heidelberg

Abstract

We consider extending the AIXI agent by using multiple (or even a compact class of) priors. This has the benefit of weakening the conditions on the true environment that we need to prove asymptotic optimality. Furthermore, it decreases the arbitrariness of picking the prior or reference machine. We connect this to removing symmetry between accepting and rejecting bets in the rationality axiomatization of AIXI and replacing it with optimism. Optimism is often used to encourage exploration in the more restrictive Markov Decision Process setting and it alleviates the problem that AIXI (with geometric discounting) stops exploring prematurely.

Keywords

AIXI, Reinforcement Learning, Optimism, Optimality

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Peter Sunehag (1)
  • Marcus Hutter (1)
  1. Research School of Computer Science, Australian National University, Canberra, Australia
